Quantum-Enhanced Machine Learning: A DIY guide (Part 5)
Quantum Generative Adversarial Networks (qGANs)
At long last, we return to the QML DIY guide. By way of a brief update, after the last article, I headed to my graduate school alma mater, the University of New Mexico, to attend the 25th annual SQuInT workshop, where I presented this poster. It was my first time in New Mexico since leaving 5 years ago to join IBM Quantum, and setting foot in the Albuquerque airport was a nostalgic moment.
(For those on the academic job market, there are some new positions available as part of the Quantum New Mexico Institute, a new joint institute between the University and Sandia National Labs. See here for details. Come to ABQ for quantum, and stay for the enchantment!)
With respect to writing these articles, I’m starting to realize I may be letting the perfect be the enemy of the good enough, which has slowed down my writing. Any advice/tips you have for how to navigate that would be appreciated!
Note: This article is part 5 in a series. You can find the inaugural article here.
With generative AI being quite the rage these days, this article seems timely. We’ll discuss what quantum generative adversarial networks (qGANs) are, cover some of their interesting uses, touch on the broader topic of using quantum computers to learn and model distributions, and see some examples of qGANs ‘in the wild’.
Before we get started though, it’s worth emphasizing that GANs are different from generative models such as foundation models or large language models (e.g., ChatGPT). Both kinds of models generate samples (or examples), but they do so differently. In short, GANs generate a sample ‘from whole cloth’ – you run the network once to get it – whereas foundation models or LLMs generate the next item in a sequence (e.g., string of text).
Why would quantum generative models matter?
You might be wondering why it makes sense to even think about using quantum computers for generative modeling tasks. (This is setting aside the fact that quantum generative modeling is nowhere near the level of sophistication of ChatGPT, DALL-E, etc.!) There are a few lines of inquiry in the literature which show separations between what quantum generative models and purely-classical ones can achieve.
One line of work has been constructing problems where you can prove things about the relative power of quantum versus classical models. Two recent papers which help exemplify this work are:
On the Quantum versus Classical Learnability of Discrete Distributions: Constructs a generative modeling problem where, under certain generally-accepted assumptions about certain cryptographic primitives (“the decisional Diffie-Hellman (DDH) assumption for the group family of quadratic residues” for the experts), you can prove a quantum generative model can efficiently learn the underlying distribution, but a classical model cannot.
A super-polynomial quantum-classical separation for density modelling: Constructs a density modeling problem based on weak pseudo-random functions which has a provable speedup using quantum computers as compared to classical. (This work builds on the above work.) Note that here, density modeling means constructing a model which, given a sample, estimates the probability of that sample according to the underlying distribution.
As a small aside, density modeling and generative modeling are not the same task, but including the above work here seemed prudent: both works give proofs about what quantum and classical learners can do, and together they show there are problems where quantum models have a provable advantage for either generative or density modeling.
A second line of work starts from the perspective of quantum circuits, asking whether there are quantum circuits which give rise to output distributions that are provably hard to learn classically. This line of work includes:
A Single T Gate Makes Distribution Learning Hard: T gates are a special kind of gate (operation) in quantum circuits; implementing them efficiently in a fault-tolerant manner is widely regarded as one of the most important milestones towards realizing fault-tolerant quantum computation. This work shows that for circuits whose depth scales polynomially with their width, if there is even a single T gate in the circuit, then performing density modeling of that circuit’s output distribution is hard. This is interesting because (a) given what is known about classically simulating such circuits, the simulation itself is quite easy (when the number of T gates is small), and (b) the result applies to both quantum and classical learners.
Learnability of the output distributions of local quantum circuits: Shows that for a family of circuits which are efficiently simulable classically (namely, Clifford circuits), the output distributions of those circuits are efficiently learnable classically, for both the generative and density modeling tasks.
So from the perspective of properties of quantum circuits, Clifford circuits – which are known to be efficiently-simulable classically – are also efficiently learnable, classically. In a way, this means that if the circuit you are considering for an application is a Clifford circuit, you really shouldn’t expect to have much formal hope for any kind of advantage: you can simulate them, efficiently generate samples from them, and efficiently estimate probabilities of samples from them. On the other hand, if you start to add T gates to the circuit, then even in regimes where classical simulation would be easy, it will be hard for classical models to learn the output distribution.
Though, as A Single T Gate Makes Distribution Learning Hard emphasizes, the hardness of learning is also applicable to quantum models. Hence, when constructing circuits to use in quantum generative models, your argument for why quantum can bring an advantage shouldn’t hinge on the existence of T gates in the circuit.
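To make the Clifford boundary above a bit more tangible, here is a small sketch using Qiskit’s Clifford class, which implements the stabilizer formalism underlying the efficient classical simulation just mentioned. The specific circuits are toy examples of my own, not from the cited papers.

```python
# A toy illustration (my own, not from the cited works): Clifford circuits
# built from H, S, and CX gates admit a compact stabilizer representation,
# which is what makes them efficiently simulable classically. Adding even a
# single T gate takes the circuit outside that representation.
from qiskit import QuantumCircuit
from qiskit.quantum_info import Clifford

clifford_circ = QuantumCircuit(2)
clifford_circ.h(0)
clifford_circ.cx(0, 1)
print(Clifford(clifford_circ))  # succeeds: a stabilizer representation exists

t_circ = QuantumCircuit(2)
t_circ.h(0)
t_circ.t(0)  # the single T gate
t_circ.cx(0, 1)
try:
    Clifford(t_circ)
except Exception as err:  # conversion fails: T is not a Clifford gate
    print("Not a Clifford circuit:", err)
```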
Having gotten a better sense of why quantum generative modeling might matter, let’s turn to the topic of this article; namely, quantum generative adversarial networks (qGANs).
What are qGANs?
To understand qGANs, let’s first take a look at what a classical GAN is. The idea of these models is to pit two neural networks against one another. Given a training data set, one network (the generator) is trying to generate ‘fake’ examples which are similar to the real data. The input to the generator is some random variable. That is, the generator transforms the random input into a random output, where the random output is intended to mimic, as closely as possible, samples from the real data set.
Those examples (or real data itself) are fed to the other network (the discriminator), which is trying to determine whether its input is ‘real’ or ‘fake’. As training proceeds, the generator is rewarded for generating fake examples which become increasingly hard for the discriminator to differentiate from the real ones, and the discriminator is rewarded for more accurately telling apart real and fake data.
At the end of the training, what you then have is a generator where, by passing in random input, you (should) get an output which is representative of, or sufficiently-similar to, a sample from the training data set.
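To make the adversarial setup concrete, below is a minimal sketch of a classical GAN training loop in PyTorch. The target ‘real’ distribution (a shifted Gaussian), the network sizes, and the hyperparameters are all illustrative assumptions of mine, not taken from any of the works discussed here.

```python
# Minimal GAN training loop sketch (illustrative assumptions throughout).
import torch
import torch.nn as nn

# Generator: maps 4-dimensional random noise to a 1-dimensional sample.
generator = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
# Discriminator: outputs the probability that its input is "real".
discriminator = nn.Sequential(
    nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid()
)

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = 0.5 * torch.randn(64, 1) + 2.0  # "real" data: a shifted Gaussian
    noise = torch.randn(64, 4)             # random input to the generator
    fake = generator(noise)

    # Discriminator update: push outputs toward 1 on real data, 0 on fakes.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: rewarded when the discriminator says "real" on fakes.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```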
Given the above, there are a couple of ways to introduce quantum. You could replace either the generator or the discriminator by a quantum model (i.e., a quantum neural network), or you could replace both. From what I’ve seen in the literature, people generally tend to make the generator quantum, and leave the discriminator classical. This makes sense to me, for two reasons:
Unless you plan to pipe the quantum state prepared by the generator directly over to the discriminator, you have to measure the qubits of the generator, resulting in a classical bitstring. Thus, since the output of the generator is classical, the discriminator might as well be.
Relatedly, unless the training data itself is in the form of quantum states, the input to the discriminator is going to be classical, so the output of the generator might as well be.
Phrased another way, making both the generator and discriminator quantum would make the most sense (at least to me!) only when the training data is a collection of quantum states, and you have the ability to transfer quantum states from the generator to the discriminator.
That said, there is one reason why you might want both the generator and discriminator to be quantum, even if you are using only classical data: the models themselves might have fewer parameters, which might make them easier to train. (As we’ll see below, one work ‘in the wild’ considers this, and finds comparable performance in their workflow when using a quantum-based discriminator.)
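Returning to the common hybrid setup, the interface between a quantum generator and a classical discriminator is just a batch of measured bitstrings. Here is a hedged sketch of that glue code; the bits-to-floats encoding is purely an illustrative assumption on my part.

```python
# Hypothetical glue between a quantum generator and a classical discriminator:
# measured bitstrings become float feature vectors the discriminator can consume.
import torch

def bitstrings_to_batch(samples):
    """Convert bitstrings like '101' into a float tensor, one row per sample."""
    return torch.tensor([[float(b) for b in s] for s in samples])

fake_samples = ["101", "001", "110"]  # stand-in for generator measurement outcomes
fake_batch = bitstrings_to_batch(fake_samples)
# fake_batch can now be scored by a classical discriminator network,
# e.g., the nn.Sequential discriminator from the sketch above.
print(fake_batch)
```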
A useful reference for generative quantum modeling in general, and qGANs in particular, is this 2021 thesis from fellow IBMer Christa Zoufal. In addition, an earlier paper from Christa (subsequently incorporated into her thesis) goes into qGANs in quite a bit of detail, and a lecture from her as part of the 2021 Qiskit Global Summer School is available here.
Finally, in the article on QNNs, we took a look at the ‘barren plateau’ phenomenon, wherein the cost function being optimized during training becomes flat (with respect to the parameters of the model) exponentially quickly. qGANs also suffer from this phenomenon, though trainability hinges on whether the loss is measured explicitly (demanding the model produce accurate estimates of the probabilities of samples) or implicitly (demanding the model simply produce accurate samples). (See Trainability barriers and opportunities in quantum generative modeling.) That said, the literature on this topic seems less well-developed than that on the trainability of QNNs, so further work is definitely needed!
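For a rough numerical feel for this (a toy experiment of my own, not from the cited paper), one can estimate the variance of a single parameter-shift gradient component over random initializations as the circuit grows; shrinking variance is the barren plateau signature. The ansatz and observable below are illustrative assumptions.

```python
# Toy barren plateau probe (illustrative assumptions throughout).
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp, Statevector

def cost(params, n):
    """Expectation of a single-qubit Z after a hardware-efficient ansatz."""
    qc = QuantumCircuit(n)
    for layer in range(n):  # depth grows with width here
        for q in range(n):
            qc.ry(params[layer, q], q)
        for q in range(n - 1):
            qc.cx(q, q + 1)
    obs = SparsePauliOp("Z" + "I" * (n - 1))
    return Statevector(qc).expectation_value(obs).real

for n in [2, 4, 6]:
    grads = []
    for _ in range(50):
        p = np.random.uniform(0, 2 * np.pi, size=(n, n))
        shift = np.zeros_like(p)
        shift[0, 0] = np.pi / 2  # parameter-shift rule for the first RY angle
        grads.append(0.5 * (cost(p + shift, n) - cost(p - shift, n)))
    print(f"{n} qubits: gradient variance ~ {np.var(grads):.4f}")
```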
What do you use qGANs for?
Given the above description of GANs and how they are trained, we can see that the essential idea is to create a model (the generator) that can generate samples from whatever distribution underlies the training data set. (That is, we can view the training data set as a sample from some distribution, and the purpose of the generator is to be able to generate more samples according to that [unknown] distribution.)
Similarly, qGANs are useful for preparing quantum states which encode probability distributions over training data. You can use the generator part of a trained qGAN in two ways:
The generator generates quantum states according to the action of a quantum circuit. Ideally, that circuit is compact; that is, it is low depth. (See Quantum Generative Adversarial Networks for learning and loading random distributions.) Having efficient quantum circuits for preparing quantum states can be useful for algorithms such as amplitude estimation, which is the workhorse in, for example, financial risk modeling using quantum computers. So you could use the circuit representing the generator directly in those sorts of algorithms.
The output of the generator is a sample from the probability distribution the circuit is modeling. So you could use the sample directly in a generative AI application.
If you are designing quantum algorithms which would benefit from efficient encoding of probability distributions, then the first way of using qGANs might help you. And if you are simply interested in using quantum computers to generate samples from probability distributions, the second way might.
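To tie the two uses together, here is a hedged Qiskit sketch of what a trained qGAN generator might look like: a shallow hardware-efficient ansatz whose prepared state encodes a distribution (the first use), and whose measurement yields classical samples (the second use). The ansatz shape and the stand-in ‘trained’ parameter values are my own illustrative assumptions.

```python
# Hypothetical trained qGAN generator: a shallow parameterized circuit.
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

n_qubits = 3
# Stand-in for trained parameter values (in practice, these come from training).
params = np.random.uniform(0, 2 * np.pi, size=2 * n_qubits)

qc = QuantumCircuit(n_qubits)
for i in range(n_qubits):      # first rotation layer
    qc.ry(params[i], i)
for i in range(n_qubits - 1):  # entangling layer
    qc.cx(i, i + 1)
for i in range(n_qubits):      # second rotation layer
    qc.ry(params[n_qubits + i], i)

# Use 1: the circuit itself is an efficient state-preparation routine,
# e.g., as an input to amplitude estimation.
# Use 2: measuring the state yields classical samples from the distribution.
probs = Statevector(qc).probabilities_dict()
bitstrings = list(probs.keys())
samples = np.random.choice(bitstrings, size=10, p=list(probs.values()))
print(samples)
```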
What have people been doing with qGANs?
We’ll close out this article with a few examples of qGANs ‘in the wild’.
GEO: Enhancing Combinatorial Optimization with Classical and Quantum Generative Models: Proposes a framework and method by which generative models can take the samples output by optimization solvers and generate new samples which might be more optimal. Note that this work focuses more on purely-classical generative models inspired by quantum than on actual quantum models; that said, the framework would still be applicable.
Hybrid Quantum-Classical Generative Adversarial Network for High Resolution Image Generation: Explores image generation based on the standard MNIST (digits) and Fashion-MNIST (clothing) data sets.
Exploring the Advantages of Quantum Generative Adversarial Networks in Generative Chemistry: Considers using qGAN generators to propose molecular fragments for chemistry. This work is interesting because in addition to using a quantum-based generator, it also considers introducing a quantum-based discriminator.
Towards AutoQML: A Cloud-Based Automated Circuit Architecture Search Framework: This is an industry-oriented work from E.On, a German electric utility. This paper introduces a framework for constructing good circuits to use in models, and looks at qGANs for the purposes of generating samples for energy prices.
Quantum integration of elementary particle processes: In high-energy physics, there are many integrals which need to be computed. This work uses qGANs to encode some simple probability distributions which appear in those integrals. Though, as the text notes, it does not take the next step of using the qGAN as a subroutine for computing the integrals via amplitude estimation.
Quantum Generative Adversarial Networks For Anomaly Detection In High Energy Physics: As the title implies, this work looks at using qGANs to help detect anomalous events (here, meaning ‘beyond standard model physics’, not UFOs!) in high-energy physics experiments.
Wrap-Up: Quantum Generative Adversarial Networks (qGANs)
Quantum Generative Adversarial Networks (qGANs) are a class of quantum machine learning models in which a quantum computer is used to train a generative model based on a tunable quantum circuit. At the end of training, the state prepared by that circuit can be used as a subroutine in other quantum algorithms, or measured to produce samples from a desired probability distribution. Understanding the exact kinds of practical distributions for which qGANs would offer a non-trivial advantage remains an open question. That said, qGANs have been explored in the context of a wide range of problems, including image generation, anomaly detection, and combinatorial optimization.
Further reading
Below are some additional papers on the topic of qGANs.
[2109.06315] Generative Quantum Learning of Joint Probability Distribution Functions
[2205.15003] Running the Dual-PQC GAN on noisy simulators and real quantum hardware
[2301.09363] A performance characterization of quantum generative models
[2305.07284] A Full Quantum Generative Adversarial Network Model for High Energy Physics Simulations
[2308.11096] MosaiQ: Quantum Generative Adversarial Networks for Image Generation on NISQ Computers