The PFN spirit we put into our required qualifications – “Qualified applicants must be familiar with all aspects of computer science”

Toru Nishikawa

2018-03-06 12:29:19

*Some of our guidelines for applicants have already been updated, based on the content of this blog post, so that our true intent is conveyed properly.


Hello, this is Nishikawa, CEO of PFN.

I am going to write about one of our hiring requirements.

It is about the wording in the job section of our website used to describe one of the qualifications/requirements for researchers – “Researchers must be seitsu (精通, roughly “thoroughly familiar with” in Japanese) with all aspects of computer science.” We have had this requirement since the days of PFI because we truly believe in the importance of having deep knowledge of not just one specific branch but of various areas when doing research in computer science.

Take database research for example. It is essential to have thorough knowledge not only of the theory of transaction processing and relational algebra, but also of storage and the computer architecture on which a database runs. Researchers also need to know about computer networks, now that distributed databases have become common. In today’s deep-learning research, a single computer cannot produce competitive results, so highly efficient parallel processing is a must. To create a framework, it is vital to understand computer architecture and language processors. When creating a domain-specific language, without understanding programming language theory you will easily end up making a language that looks like an annex added to a building as an afterthought. In reinforcement learning, it is important to refine simulation and rendering technologies.

In short, we live in an age when someone who knows about only one particular area no longer has an advantage. Furthermore, it is difficult to know in advance which areas of computer science should be fused to generate new technology. In order to realize our mission, that is, to make breakthroughs with cutting-edge technologies, it is extremely important to strive to familiarize oneself with each and every branch of computer science.

This familiarity, a comprehensive knowledge of and deep understanding in every field of computer science, is what the Japanese word seitsu mentioned in the first paragraph expresses. The word does not mean you can publish papers at top conferences; that would require not only seitsu but also the ability to conduct new, groundbreaking research. (Being able to perform such research is a very important skill that we also need to acquire.) Nor does it mean to “know everything” about each field. Someone who declares they know everything is, rather, not a scientist.

The field of computer science is making rapid progress, and we must always pursue its advancement. I sometimes come across comments on social media making fun of the phrase “all aspects of computer science,” but the message we put into this job requirement has played an important role in shaping PFN’s culture, and so it has remained to date. We will continue to stick to this principle. That said, we also understand the need to come up with wording that is not misleading. The domain of computer science has been expanding rapidly over the past decade, and this trend will no doubt continue to accelerate. New fields of study will emerge from combining different fields within and outside the computer science domain. Considering this, we will revise the requirement in light of the following factors:


・Rather than being well versed in all aspects of computer science as it stands today, it will become more important to absorb the changes and progress made in computer science and to become acquainted with new fields as they emerge. (It is, of course, still necessary to have extensive knowledge.) We will treat an applicant’s eagerness and passion for learning as more important than their current knowledge.

・We value an applicant’s forward-looking attitude toward deepening their understanding not only of computer science but also of other fields such as software engineering, life science, and mechanical engineering.

・We welcome not only experts in artificial intelligence but also specialists in various other areas of expertise, so that we can create innovation by combining new technologies.


This criterion has applied only to researchers, but I believe it is crucial for everyone, researchers and engineers alike, to be united in opening up a path to new technology: researchers need to have some engineering knowledge, and engineers need to make an effort to understand research as well. Therefore, we will make this a requirement for both researchers and engineers.

It is also an important duty of mine to create a workplace in which all of our valuable PFN members can do their best to innovate and create new technology, and I will continue to work actively on this.


PFN is looking for talented people with diverse expertise in various fields. If you are interested in working with us, please apply at the following link.

We have released ChainerUI, a training visualizer and manager for Chainer


2017-12-20 10:58:31

We have released ChainerUI to help visualize training results and manage training jobs.

Among Chainer users, there is demand for watching the progress of training jobs and for comparing runs by plotting the training loss, accuracy, and other logged values of multiple runs. These tasks have been cumbersome because no suitable application was available. ChainerUI offers the functions listed below to support your DNN training routine.

  • Visualizing training logs: plotting values such as loss and accuracy
  • Managing the history of training jobs together with their experimental conditions
  • Operating training jobs: taking snapshots and modifying hyperparameters such as the learning rate while training

ChainerUI consists of a web application and an extension module for Chainer’s Trainer, which makes it easy to add to existing training scripts. If you already use the LogReport extension, you can watch training logs in a web browser without any changes. If you add the other ChainerUI extensions, more experimental conditions are displayed in the results table and training jobs can be managed from ChainerUI.

Visualizing training logs

ChainerUI monitors the training log file and plots values such as loss and accuracy. Users can choose which variables to plot on the chart.

Managing training jobs

ChainerUI’s web application shows a list of training jobs in a results table, together with their experimental conditions. In addition, you can take actions such as taking a snapshot or modifying hyperparameters from the job control panel.

How to use

To install, use pip, and then set up the ChainerUI database.

pip install chainerui
chainerui db create
chainerui db upgrade

Next, register a “project” and run the server. A “project” is a repository or directory that contains Chainer-based scripts.

chainerui project create -d PROJECT_DIR [-n PROJECT_NAME]
chainerui server

You can also call chainerui project create while the server is running.

Finally, open http://localhost:5000/ in a web browser, and you are ready!

Visualizing training logs

The standard LogReport extension included in Chainer exports a “log” file. ChainerUI watches that “log” file and plots charts automatically. The following commands run Chainer’s MNIST example and show the plotting with ChainerUI.

chainerui project create -d path/to/result -n mnist
python train_mnist.py -o path/to/result/1

The result under “path/to/result/1” is added to the “mnist” project. ChainerUI continuously watches the “log” file updated in “path/to/result/1” and plots the values written in it.
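For reference, here is a minimal, self-contained training script of the kind ChainerUI can monitor. It is a sketch only, using toy random data instead of the actual MNIST example; the point is simply that the standard LogReport extension writes the “log” file that ChainerUI watches.

import numpy as np
import chainer
import chainer.links as L
from chainer import iterators, optimizers, training
from chainer.training import extensions

# Toy dataset: 100 samples, 10 features, binary labels.
x = np.random.rand(100, 10).astype(np.float32)
y = np.random.randint(0, 2, size=100).astype(np.int32)
train = chainer.datasets.TupleDataset(x, y)

model = L.Classifier(L.Linear(10, 2))
optimizer = optimizers.Adam()
optimizer.setup(model)

train_iter = iterators.SerialIterator(train, batch_size=10)
updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (5, 'epoch'), out='path/to/result/1')

# LogReport writes the JSON "log" file that ChainerUI monitors and plots.
trainer.extend(extensions.LogReport())
trainer.run()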

Managing training jobs

ChainerUI also monitors the “args” file located in the same directory as the “log” file, and shows its contents in the results table as experimental conditions. The “args” file holds key-value pairs in JSON format.
The sample code below shows how to save the “args” file using ChainerUI’s utility function.

import argparse

# [ChainerUI] import chainerui util function
from chainerui.utils import save_args


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--out', '-o', default='result',
                        help='Directory to output the result')
    args = parser.parse_args()

    # [ChainerUI] save 'args' to show experimental conditions
    save_args(args, args.out)

To operate training jobs, add the CommandsExtension to the training script. This extension supports taking snapshots and changing hyperparameters such as the learning rate while the training job is running.

# [ChainerUI] import CommandsExtension
from chainerui.extensions import CommandsExtension


def main():
    trainer = training.Trainer(updater, (args.epoch, 'epoch'), out=args.out)

    # [ChainerUI] enable sending commands from ChainerUI
    trainer.extend(CommandsExtension())

To see the whole code, please refer to the examples/ directory.


ChainerUI was mainly developed by Inagaki-san and Kobayashi-san, who participated in the summer internship at Preferred Networks this year.

During their two-month internship program, they specified user requirements and implemented a prototype. They have continued to contribute after the internship as part-time workers, and they are proud to release their work as “ChainerUI.”

Future plan

ChainerUI is being developed under the Chainer organization. Our future plans include the following features.

  • Exporting charts as image files
  • Adding more extensions for operating training scripts, etc.

We are also hiring front-end engineers to work on features like these! We look forward to receiving your applications.

Release Chainer Chemistry: A library for Deep Learning in Biology and Chemistry

Kosuke Nakago

2017-12-18 11:40:20


* This post is also available in Japanese here.

We released Chainer Chemistry, a Chainer [1] extension to train and run neural networks for tasks in biology and chemistry.

The library helps you easily apply deep learning to molecular structures.

For example, you can apply machine learning to toxicity classification tasks or to HOMO (highest occupied molecular orbital) level regression tasks, with compounds as input.

The library was developed during the PFN 2017 summer internship, and part of it was implemented by an intern, Hirotaka Akita from Kyoto University.


Supported features

Graph Convolutional Neural Network implementation

The recently proposed Graph Convolutional Neural Network (see below for details) opened the door to applying deep learning to “graph structure” inputs, and graph convolutional networks are currently an active area of research. We implemented several graph convolutional network architectures, including networks introduced in papers published this year.

The following models are implemented:

  • NFP: Neural Fingerprint [2, 3]
  • GGNN: Gated-Graph Neural Network [4, 3]
  • WeaveNet: Molecular Graph Convolutions [5, 3]
  • SchNet: A continuous-filter convolutional Neural Network [6]


Common data preprocessing/research dataset support

Various datasets can be used through a common interface in this library. Some research datasets can also be downloaded and preprocessed automatically.

The following datasets are supported:

  • QM9 [7, 8]: dataset of organic molecular structures with up to nine C/O/N/F atoms and their computed physical property values. The values include HOMO/LUMO level and internal energy. The computation is B3LYP/6-31G level of quantum chemistry.
  • Tox21 [9]: dataset of toxicity measurements on 12 biological targets


Train/inference example code is available

We provide example code for training models and running inference, so that you can easily try the models implemented in this library as a quick start. A rough sketch of what such a script looks like is shown below.
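The following is a hedged sketch only, not the bundled example code: the dataset loader get_tox21, the preprocessor NFPPreprocessor, the model class NFP, and the GraphConvPredictor wrapper are assumptions about the library interface, so please consult the actual examples and tutorial for authoritative usage.

# Hedged sketch (not the bundled example): names below are assumptions.
import chainer
import chainer.links as L
from chainer_chemistry import datasets as D
from chainer_chemistry.dataset.preprocessors import NFPPreprocessor
from chainer_chemistry.models import NFP

# Download and preprocess the Tox21 dataset into graph-style features.
preprocessor = NFPPreprocessor()
train, val, test = D.get_tox21(preprocessor, labels=None)


# Graph-convolutional encoder (NFP) followed by a small prediction head.
class GraphConvPredictor(chainer.Chain):
    def __init__(self, graph_conv, mlp):
        super(GraphConvPredictor, self).__init__()
        with self.init_scope():
            self.graph_conv = graph_conv
            self.mlp = mlp

    def __call__(self, atoms, adjs):
        x = self.graph_conv(atoms, adjs)  # molecule-level embedding
        return self.mlp(x)                # e.g. outputs for the 12 Tox21 assays


model = GraphConvPredictor(NFP(out_dim=64), L.Linear(None, 12))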



In the fields of new material discovery and drug discovery, simulating the behavior of molecules is important. When quantum effects need to be taken into account with high precision, DFT (density functional theory) is widely used. However, it requires a lot of computational resources, especially for large molecules, which makes it difficult to run simulations over many molecular structures.

A different approach comes from the machine learning field: learn from data measured or calculated in previous experiments, and predict the chemical properties of molecules that have not yet been examined. A neural network may produce such predictions much faster than a quantum simulation.


Figure cited from “Neural Message Passing for Quantum Chemistry”, Gilmer et al. [3]


An important question is how to handle compounds as the input and output of deep learning. The main problem is that molecular structures have varying numbers of atoms and are represented as different graph structures, whereas conventional deep learning methods deal with fixed-size, fixed-structure input.

However, the “Graph Convolutional Neural Network” was proposed to deal with such graph-structured input.


What is a Graph Convolutional Neural Network

Convolutional Neural Networks introduce “convolution” layers which apply a kernel to local information in an image. They show promising results on many image tasks, including classification, detection, segmentation, and even image generation.

Graph Convolutional Neural Networks introduce a “graph convolution” operation which applies a kernel over the neighboring nodes of a graph in order to deal with graph structure. A toy sketch of a single graph-convolution step is shown below.
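To make the analogy concrete, here is a self-contained toy sketch of one graph-convolution step in plain NumPy. It illustrates the idea only and is not the implementation used in Chainer Chemistry.

import numpy as np


# Toy illustration of one graph-convolution step: every node gathers its
# neighbors' features through the adjacency matrix, then applies a weight
# matrix shared by all nodes, just as an image convolution shares one kernel
# across all local patches.
def graph_convolution(H, A, W):
    """H: (num_nodes, in_dim) node features (e.g. atom features)
       A: (num_nodes, num_nodes) adjacency matrix (1 where atoms are bonded)
       W: (in_dim, out_dim) learnable weights shared by all nodes"""
    A_hat = A + np.eye(A.shape[0], dtype=A.dtype)  # let each node see itself
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))       # normalize by node degree
    return np.maximum(D_inv @ A_hat @ H @ W, 0.0)  # aggregate, transform, ReLU


# Example: a 3-atom chain "molecule", 4 input features, 8 output features.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=np.float32)
H = np.random.randn(3, 4).astype(np.float32)
W = np.random.randn(4, 8).astype(np.float32)
print(graph_convolution(H, A, W).shape)  # -> (3, 8)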


How graph convolutions work

A CNN takes an image as input, whereas a graph CNN can take a graph structure (such as a molecular structure) as input.

Their application is not limited to molecular structures. Graph structures appear in many other fields, including social networks and transportation, and research into applications of graph convolutional neural networks is an interesting topic. For example, [10] applied graph convolutions to images, [11] applied them to knowledge bases, and [12] applied them to traffic forecasting.


Target users

  1. Deep learning researchers
    This library provides implementations of the latest graph convolutional neural networks.
    Applications of graph convolutions are not limited to biology and chemistry but span various other fields, and we would like many people to use this library.
  2. Material/drug discovery researchers
    The library enables users to build their own models to predict various chemical properties of molecules.


Future plan

This library is still a beta version and under active development. We plan to support the following features:

  • Provide pre-trained models for inference
  • Add more datasets
  • Implement more networks

We have prepared a Tutorial to get started with this library. Please try it and let us know if you have any feedback.



[1] Tokui, S., Oono, K., Hido, S., & Clayton, J. (2015). Chainer: a next-generation open source framework for deep learning. In Proceedings of workshop on machine learning systems (LearningSys) in the twenty-ninth annual conference on neural information processing systems (NIPS) (Vol. 5).

[2] Duvenaud, D. K., Maclaurin, D., Iparraguirre, J., Bombarell, R., Hirzel, T., Aspuru-Guzik, A., & Adams, R. P. (2015). Convolutional networks on graphs for learning molecular fingerprints. In Advances in neural information processing systems (pp. 2224-2232).

[3] Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017). Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212.

[4] Li, Y., Tarlow, D., Brockschmidt, M., & Zemel, R. (2015). Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493.

[5] Kearnes, S., McCloskey, K., Berndl, M., Pande, V., & Riley, P. (2016). Molecular graph convolutions: moving beyond fingerprints. Journal of computer-aided molecular design, 30(8), 595-608.

[6] Kristof T. Schütt, Pieter-Jan Kindermans, Huziel E. Sauceda, Stefan Chmiela, Alexandre Tkatchenko, Klaus-Robert Müller (2017). SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. arXiv preprint arXiv:1706.08566

[7] L. Ruddigkeit, R. van Deursen, L. C. Blum, J.-L. Reymond, Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17, J. Chem. Inf. Model. 52, 2864–2875, 2012.

[8] R. Ramakrishnan, P. O. Dral, M. Rupp, O. A. von Lilienfeld, Quantum chemistry structures and properties of 134 kilo molecules, Scientific Data 1, 140022, 2014.

[9] Huang R, Xia M, Nguyen D-T, Zhao T, Sakamuru S, Zhao J, Shahane SA, Rossoshek A and Simeonov A (2016) Tox21 Challenge to Build Predictive Models of Nuclear Receptor and Stress Response Pathways as Mediated by Exposure to Environmental Chemicals and Drugs. Front. Environ. Sci. 3:85. doi: 10.3389/fenvs.2015.00085

[10] Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst (2016), Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering, NIPS 2016.

[11] Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling (2017). Modeling Relational Data with Graph Convolutional Networks. arXiv preprint arXiv:1703.06103.

[12] Yaguang Li, Rose Yu, Cyrus Shahabi, Yan Liu (2017). Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. arXiv preprint arXiv:1707.01926.


MN-1: The GPU cluster behind 15-min ImageNet


2017-11-30 11:00:05

Preferred Networks, Inc. has completed ImageNet training in 15 minutes [1,2]. This is the fastest time ever achieved for a 90-epoch ImageNet training. Let me describe the MN-1 cluster used for this accomplishment.

Preferred Networks’ MN-1 cluster started operation this September [3]. It consists of 128 nodes with 8 NVIDIA P100 GPUs each, for 1024 GPUs in total. Since each GPU has a theoretical peak of 4.7 TFLOPS in double-precision floating point, the total theoretical peak capacity exceeds 4.7 PFLOPS (including the CPUs as well). The nodes are connected with two FDR InfiniBand links (56 Gbps × 2). PFN has exclusive use of the cluster, which is located in an NTT datacenter.
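As a quick back-of-the-envelope check of those figures (counting only the GPUs; the peak quoted above also includes the CPUs):

# Back-of-the-envelope check of the MN-1 figures quoted above (GPUs only).
nodes, gpus_per_node = 128, 8
fp64_tflops_per_p100 = 4.7                  # P100 double-precision theoretical peak
total_gpus = nodes * gpus_per_node          # 1024 GPUs
peak_pflops = total_gpus * fp64_tflops_per_p100 / 1000.0
linpack_pflops = 1.39                       # measured LINPACK result
print(total_gpus, round(peak_pflops, 2), round(linpack_pflops / peak_pflops, 2))
# -> 1024  4.81  0.29  (consistent with the ~28% efficiency discussed below)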

MN-1 Cluster in an NTT Datacenter

On the TOP500 list published this November, the MN-1 cluster is ranked the 91st most powerful supercomputer, with approximately 1.39 PFLOPS of maximum performance on the LINPACK benchmark [4]. Compared to traditional supercomputers, MN-1’s computation efficiency (28%) is not high. One of the performance bottlenecks is the interconnect. Unlike typical supercomputers, MN-1 is connected as a thin tree (as opposed to a fat tree). Each group of sixteen nodes is connected to a pair of redundant InfiniBand switches. The cluster has eight such groups, and the links between groups are aggregated into another redundant pair of InfiniBand switches. Thus, when a process needs to communicate with a different group, the inter-group link becomes a bottleneck, which lowers the LINPACK benchmark score.

Distributed Learning in ChainerMN

However, as stated at the beginning of this article, MN-1 can perform ultra-fast deep learning (DL) training. This is because ChainerMN does not require bottleneck-free communication for DL training. During training, ChainerMN collects and re-distributes parameter updates across all nodes. In the 15-minute trial, we used the ring allreduce algorithm, in which nodes communicate with their adjacent nodes in a ring topology: gradients are accumulated during the first pass around the ring, and the accumulated parameter update is distributed during the second pass (see the sketch below). Since we can form such a ring without hitting the interconnect bottleneck on a full-duplex network, the MN-1 cluster can efficiently finish the ImageNet training in 15 minutes with 1024 GPUs.
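For intuition, here is a toy single-process simulation of ring allreduce in Python. It illustrates the two passes only; it is not ChainerMN's implementation, which performs the communication over MPI and InfiniBand.

import numpy as np


def ring_allreduce(grads):
    """Toy simulation: each of the n "workers" starts with its own gradient
    vector, split into n chunks. Chunks travel around the ring twice: once to
    accumulate partial sums (reduce-scatter), once to distribute the fully
    reduced chunks to every worker (allgather)."""
    n = len(grads)
    chunks = [list(np.array_split(g, n)) for g in grads]  # chunks[worker][chunk]

    # Pass 1 (reduce-scatter): at step s, worker i sends chunk (i - s) mod n to
    # neighbor (i + 1) mod n, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        sent = [chunks[i][(i - s) % n] for i in range(n)]  # snapshot before updates
        for i in range(n):
            chunks[(i + 1) % n][(i - s) % n] = chunks[(i + 1) % n][(i - s) % n] + sent[i]

    # After pass 1, worker i holds the fully reduced chunk (i + 1) mod n.
    # Pass 2 (allgather): at step s, worker i forwards chunk (i + 1 - s) mod n,
    # and the receiver overwrites its own copy with it.
    for s in range(n - 1):
        sent = [chunks[i][(i + 1 - s) % n] for i in range(n)]
        for i in range(n):
            chunks[(i + 1) % n][(i + 1 - s) % n] = sent[i]

    return [np.concatenate(c) for c in chunks]  # every worker now has the full sum


# Example: 4 workers, each with an 8-element gradient of constant value 0..3.
grads = [np.full(8, w, dtype=np.float32) for w in range(4)]
print(ring_allreduce(grads)[0])  # -> [6. 6. ...]  (0 + 1 + 2 + 3)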

Scalability of ChainerMN up to 1024 GPUs





IROS 2017 Report


2017-11-06 10:30:04

Writers: Ryoma Kawajiri, Jethro Tan

Preferred Networks (PFN) attended the 30th IEEE/RSJ IROS conference held in Vancouver, Canada. IROS is known to be the second biggest robotics conference in the world after ICRA (see here for our report on this year’s ICRA), with 2,797 total registrants and 2,164 submitted papers, of which 970 were accepted for an acceptance rate of 44.82%. With no fewer than 18 sessions held in parallel, our members had a hard time deciding which ones to attend.


2018 Intern Results at Preferred Networks (Part 1)


2017-10-18 07:44:24

This summer, Preferred Networks accepted a record number of interns in Tokyo from all over the world. They tackled challenging tasks in artificial intelligence together with PFN mentors. We appreciate their passion, focus, and dedication during the internship.

In this post, we would like to share some of their great work (more to come).


Guest blog with Weihua, a former intern at PFN


2017-09-11 16:29:13

This is a guest post, in an interview style, with Weihua Hu, a former intern at Preferred Networks last year from the University of Tokyo, whose research was extended after the internship and accepted at ICML 2017.

“Learning Discrete Representations via Information Maximizing Self-Augmented Training,” Weihua Hu, Takeru Miyato, Seiya Tokui, Eiichi Matsumoto, and Masashi Sugiyama; Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1558-1567, 2017. (Link)


ACL 2017 Report

Yuta Kikuchi

2017-09-08 13:54:22

Writers: Yuta Kikuchi, Sosuke Kobayashi

Preferred Networks (PFN) attended the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017) in Vancouver, Canada. ACL is one of the largest conferences in the Natural Language Processing (NLP) field.

As in other machine learning research fields, the use of deep learning in NLP is increasing. The most popular topic is sequence-to-sequence learning, in which a model receives a sequence of discrete symbols (words) and learns to output a correct sequence conditioned on the input.



Deep Reinforcement Learning Bootcamp: Event Report


2017-09-04 14:49:39

Preferred Networks proudly sponsored an exciting two-day event, Deep Reinforcement Learning Bootcamp, which was held August 26–27 at UC Berkeley.

The instructors of this event included famous researchers in the field, such as Vlad Mnih (DeepMind, creator of DQN), Pieter Abbeel (OpenAI/UC Berkeley), Sergey Levine (Google Brain/UC Berkeley), Andrej Karpathy (Tesla, head of AI), and John Schulman (OpenAI), as well as up-and-coming researchers such as Chelsea Finn, Rocky Duan, and Peter Chen (UC Berkeley).



ICML 2017 Report

Brian Vogel

2017-08-25 10:09:00

Preferred Networks (PFN) attended the International Conference on Machine Learning (ICML) in Sydney, Australia. The first ICML was held in 1980 in Pittsburgh, last year’s conference was in New York, and the 2018 ICML will be held in Stockholm. ICML is one of the largest machine learning conferences, with approximately 2400 people attending this year. There were 434 accepted submissions spanning nearly all areas of machine learning.

