Kosuke Nakago

2019-07-16 11:59:50

This post is contributed by Mr. Kaushalya Madhawa, who was an intern and a part-time engineer at PFN. Japanese version is available here.

In this post we introduce our recent paper “GraphNVP: An Invertible Flow Model for Generating Molecular Graphs“. Our code can be accessed from Github repo.

Molecule Generation

Discovery of new molecules with desirable pharmacological properties is a crucial problem in computational drug discovery. Traditionally, this task is performed by clinically synthesizing candidate chemical compounds and running experiments over them. However, due to the sheer size of chemical space, synthesizing molecules and extensively performing experiments on them is an extremely time consuming task. Instead of searching through the space of molecules with desirable properties, de novo drug design involves designing new chemical compounds with the properties that we are interested in.

Recent advancements in deep learning, especially deep generative models proved to be invaluable in de novo drug designing.

The choice of molecule representation

An important step in the application of deep learning on molecule generation is how chemical compounds are represented. Earlier models relied on a string-based representation named SMILES. RNN-based language models or Variational Autoencoders (VAE) are used to generate SMILES strings which are then converted to molecules. A major issue in using SMILES strings is that they are not robust to minor changes of a string, resulting in drastically different molecules although the corresponding SMILES strings are almost similar. These problems prompted recent researches to rely on more expressive graph representations of molecules. Therefore, this problem became to known as molecular graph generation.

A molecule is represented by an undirected graph, in which the atoms and bonds are represented nodes and edges respectively. The structure of a molecule is represented by an adjacency tensor \(A \) and a node feature matrix \(X\) is used to represent the type of atoms (e.g., Oxygen, Fluorine etc.). The molecule generation problem reduces to generation of graphs which can represent valid molecules. This is a problem in which deep generative models such as GANs or VAEs can be leveraged. We can classify previous work into two categories based on how they generate a graph. Some models generate molecular graphs sequentially such that nodes (atoms) and edges (bonds) are added in a step-by-step fashion. The alternative is straightforward, generate a graph in a single step in a similar manner to image generation models.

The importance of reversibility

A significant advantage of the invertible flow-based models is they perform precise likelihood maximization, unlike VAEs or GANs. We believe precise optimization is crucial in molecule generation for drugs as they are highly sensitive to a minor replacement of a single atom (node). An additional advantage of flow models is that, since they are invertible by design, perfect reconstruction is guaranteed and no time-consuming procedures are needed. Simply running the reverse step of the model on a latent vector results in a molecular graph. Moreover, the lack of an encoder in GAN models makes it challenging to manipulate the sample generation. For example, it is not straightforward to use a GAN model to generate molecules that are similar to a query molecule (e.g., lead optimization for drug discovery), while it is easy for flow-based models.

Our model

GraphNVP, our proposed model is shown above. GraphNVP is the first graph generation model based on the invertible flow which follows one-shot generation strategy. We introduce two latent representations, one for node assignments and another for the adjacency tensor, to capture the unknown distributions of the graph structure and its node assignments respectively. We use two new types of coupling layers: Adjacency Coupling and Node Feature Coupling for obtaining these two latent representations. During graph generation, first we generate an adjacency tensor and then the node feature tensor is generated using graph convolutional networks.

Qualitative results

We randomly select a molecule from the training set and encode it into a latent vector \(z_0\) using our proposed model. Then we choose two random axes which are orthogonal to each other. We decode latent points lying on a 2-dimensional grid spanned by those two axes and with \(z_0\) as the origin. The visualization below indicates that the learned latent space is smooth such that neighboring latent points correspond to molecules with minor variations.


Comments from mentors

We, Nakago and Ishiguro, were responsible for mentor of Kaushalya. We started this research from 2018 summer internship. The research of deep graph generative models are getting attention, and many kinds of models are suggested at that time. However the model with Flow was still not suggested, and we started this research based on suggestion from Kaushalya.

It is first time application for graph generation, and model tend to need deeper layers for neural network with flow which requires large computation resource. It took some time to complete the research, but it was glad that we could publish the paper as well as the code finally.

Many projects are running in PFN, not only in “Drug Discovery / Material Discovery” but also in various kinds of fields. Please check our job list if you get interested!

New HCI group + upcoming papers and demos at UIST and ISS 2018

Fabrice Matulic

2018-10-15 09:11:17

Creation of HCI group

At PFN, we aspire to create next-generation “intelligent” systems and services, powered by cutting-edge AI technology, but we also recognise that humans will remain essential actors in the design and usage of such systems and therefore it is paramount to think about how the dialogue occurs. Human-Computer Interaction (HCI) approaches, which focus on bridging the gap between people and machines, can considerably contribute to improving intricate machine-learning processes requiring human intervention. With the creation of a dedicated HCI group at PFN, we aim to advance user-centred design for AI and machines and make sure the “humans in the loop” are supported with powerful tools when working with such systems.

Broadly, there are three main lines of research that the team would like to pursue:

  • HCI for machine learning: Utilise HCI methods to facilitate complex or tedious machine-learning processes in which people are involved (such as data gathering, labelling, pre-processing, augmentation; neural network engineering, deployment, and management)
  • Machine-learning for HCI: Use deep learning to enhance existing or enable new interaction techniques (e.g. advanced gesture recognition, activity recognition, multimodal input, sensor fusion, embodied interaction, collaboration between AI, robots and humans, generative model to create interactive content etc.)
  • Human-Robot Interaction (HRI): Make communication and interaction between smart robots and their users natural, intuitive and hopefully even fun!

Of course, HCI does not necessarily involve machine learning or robots and we are also generally interested in creating novel and exciting interactive experiences.

The HCI group will benefit from the expertise of Prof. Takeo Igarashi, of The University of Tokyo, who has been hired as an external consultant. In addition to his wide experience in HCI and HRI, Prof. Igarashi has recently started a JST CREST project on “HCI for machine learning” at his lab, which very much aligns with our research interests. We look forward to a long and fruitful collaboration.

Papers and demos at UIST and ISS 2018

Although the group was just officially created, we have been active in HCI research for the past months already and we will present two papers on recent work, respectively at UIST this week and ISS next month.

The first project, which was started at the University of Waterloo with Drini Cami and Prof. Dan Vogel, proposes to use different ways of holding a stylus pen while writing on a tablet to trigger different UI actions. The technique uses machine learning on the raw touch input data to detect these different pen grips when the user contacts the surface with the hand. The advantage of our technique is that it allows to rapidly switch between various pen modes using the same hand that writes and without resorting to cumbersome UI widgets.

In addition to the paper presentation, Drini will also be showing the technique at UIST’s popular demo session.

Our second contribution is the interactive projection mapping system for PaintsChainer that we showed at the Winter Comiket last year. For those of you who missed it, ColourAIze (which is how we call it in the paper) works directly with drawings and art on paper. Specifically, it projects colour fills determined by PaintsChainer directly onto the paper drawing with the colouring superimposed on the line art. Like with the web version of PaintsChainer, the ability to specify local colour hints to influence the colourisation is supported through simple (digital) pen strokes.

As with the pen-posture project above, we will both present our paper and do a demo of the system at the conference. If you’d like to try the fun experience of having your paper sketches, drawings and mangas coloured by AI, come and see us at ISS in Tokyo in November!

Last but not least, we are looking for talented HCI researchers to join our team, so if you think you can contribute in the areas mentioned above, please check the details of the position on our jobs page and apply!


CHI 2018 and PacificVis 2018

Fabrice Matulic

2018-05-08 12:02:31

This is Fabrice, Human Computer Interaction (HCI) researcher at PFN.

While automated systems based on deep neural networks are making rapid progress, it is important not to neglect the human factors involved in those processes, an aspect that is frequently referred to as “human in the loop”. In this respect, the HCI research community is well positioned to not only utilise advanced machine learning techniques as tools to create novel user-centred applications, but also to contribute approaches to facilitate the introduction, use and management of those complex tools. The information visualisation (InfoVis) community has started to shed some light into the “black box” of deep neural networks by proposing visualisations and user interfaces that help practitioners better understand what is happening inside it. PFN is closely following what is going on in HCI and InfoVis/Visual Analytics research and also aims to contribute in those areas.


The 11th IEEE Pacific Visualization Symposium (PacificVis 2018), which PFN sponsored and attended, was held in Kobe in April. Machine learning was well covered with several contributions in that area, including the first keynote by Prof. Shixia Liu of Tsinghua University on “Explainable Machine Learning” and the best paper “GANViz: A Visual Analytics Approach to Understand the Adversarial Game“, which followed in the footsteps of the best paper of IEEE VIS’17 about a visual analytics system for TensorFlow. Those contributions are closely related to Explainable Artificial Intelligence (XAI), an effort to produce machine learning techniques based on explainable models and interfaces that can help users understand and interpret why and how automated systems come to particular decisions or results. Whether those algorithms and tools will be sufficient to fulfil the right to explanation of the EU’s new General Data Protection Regulation (GDPR) remains to be seen.



The ACM Conference on Human Factors in Computing Systems (CHI) is the premier international conference of Human-Computer Interaction. This year it took place in Montreal, Canada, with attendance exceeding 3300 participants and an official welcome letter by Prime Minister Justin Trudeau.

A common use of machine learning in HCI is to detect or recognise patterns from complex sensor data in order to realise novel interaction techniques, e.g. palm contact detection from raw touch data, handwriting recognition using pen tip motion and writing sound. With the wide availability of deep learning frameworks, HCI researchers have integrated those new tools in their arsenal to increase the recognition performance for previous techniques or to create entirely new ones, which would have been ineffective or difficult to realise using old methods. Good examples of the latter are systems enabled by generative nets. For instance, DeepWriting is a deep generative model that can generate handwriting from typeset text and even beautify or mimic handwriting styles. ExtVision, which is inspired by IllumiRoom, automatically generates peripheral images using conditional adversarial nets instead of using actual content.

Aksan, E., Pece, F. and Hilliges, O. DeepWriting: Making Digital Ink Editable via Deep Generative Modeling. Code made available on Github.

Two other categories of applications of machine learning that we increasingly see in HCI are for interaction prediction and emotional state estimation. In the former category, Li, Bengio (Samy) and Bailly investigated how DNNs can predict human performance in interaction tasks using the example of vertical menu selection. For emotion and state recognition, in addition to an introductory course by Lex Fridman from MIT on “deep learning for understanding the human”, two papers about estimating cognitive load from eye pupil movements in videos and EEG signals were presented. With the non-stopping proliferation of sensors in mobile and wearable devices, we are bound to see more and more “smart” systems that seek to better understand people and anticipate their moves, for good or bad.

CHI also includes many vis contributions and this year was no exception. Of particular relevance for visual exploration of big data and DNN understanding was the work by Cavallo and Demiralp, who created a visual interaction framework to improve exploratory analysis of high-dimensional data using tools to navigate in a reduced dimension graph and observe how modifying the reduced data affects the initial dataset. The examples using autoencoders on MNIST and QuickDraw, where the user draws on input samples to see how results change, are particularly interesting.

Cavallo M, Demiralp Ç. A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration.

I should also mention DuetDraw, a prototype that allows users and AI to sketch collaboratively and which uses PaintsChainer!

Multiray: Multi-Finger Raycasting for Large Displays

My contribution to CHI this year was not related to machine learning. It involved interacting with remote displays using multiple rays emanating from the fingers. This work with Dan Vogel, which received an honourable mention, was done while I was at the University of Waterloo. The idea is to extend single-finger raycasting to multiple rays using two or more fingers in order to increase the interaction vocabulary, in particular through a number geometric shapes that users form with the projected points on the screen.

Matulic F, Vogel D. Multiray: Multi-Finger Raycasting for Large Displays

Final thoughts

So far, it is mostly the vis community that has tackled the challenge of opening up the black box of DNNs, but being focused on visualisation, many of the proposed tools have only limited interactive capabilities, especially when it comes to tweaking input and output data to understand how it affects the neurons of the inner layers. This is where HCI researchers need to step up and create the tools to support dynamic analysis of DNNs with possibilities to interactively make adjustments to the models. HCI approaches are also needed to improve the other processes of machine-learning pipelines in which humans are involved, such as data labelling, model selection and integration, data augmentation and generation etc. I think we can expect to see an increasing amount of work addressing those aspects at future CHIs and other HCI venues.

Guest blog with Weihua, a former intern at PFN


2017-09-11 16:29:13

This is a guest post in an interview style with Weihua Hu, a former intern at Preferred Networks last year from University of Tokyo, whose research has been extended after the internship and accepted at ICML 2017.

“Learning Discrete Representations via Information Maximizing Self-Augmented Training,” Weihua Hu, Takeru Miyato, Seiya Tokui, Eiichi Matsumoto, and Masashi Sugiyama; Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1558-1567, 2017. (Link)

more »

FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects


2017-07-25 10:01:28

Hello! My name is Hitoshi Kusano. I participated in 2016 PFN summer internship program and have been a part-time engineer at PFN ever since, while also studying machine learning at Kyoto University. At PFN, my research topic was to teach an industrial robot how to pick up an object.

Based on the result of my work at PFN, I presented a paper titled ”FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects” at the Learning and Control for Autonomous Manipulation Systems: The Role of Dimensionality Reduction workshop at ICRA 2017, the world’s biggest annual conference in robotics.

more »

ICLR 2017: Conference Report & Coming ICML


2017-07-21 11:20:57

I am Takeru Miyato, a researcher at Preferred Networks (PFN), and I participated in  ICLR 2017 (4/24-4/26), which is the biggest conference on deep learning research.

Let me give you a brief overview of the event. ICLR has been held since 2013 and this was the fifth ICLR. The main features of ICLR are:

  • Focus on deep learning and its application. Most of the papers focus on neural networks.
  • Adaptation of open review system. Everyone can join the review process. To be precise, everyone can see the all of the reviews and rebuttals and also can comment, ask questions, and post his or her reviews as public reviews. In addition, authors can update their paper anytime from the feedback until the end of the discussion phase.

As far as I know, there is no other conference exposing the all of the reviews and rebuttals to the public, which I think is interesting / helpful to the people who write or review research papers. Also, some people analyzed the submissions and reviews, and they posted articles with interesting results. Here are some links to a pair of interesting ones:

more »