The Box of No Return: To Wield Yoshimi the Great

Yoshimi is a very special software synthesizer. It is a significant rewrite and variation with new directions, after a fork from the ZynAddSubFX project, which itself is deserving of tremendous congratulation. Both synthesize tone signals on the fly, in highly manipulable fashion. Will Godfrey, Linux musician extraordinaire, has contributed:

A car analogy.

A sample player is a drive along a straight, wide, almost new highway with only 2 other cars in sight, on a lightly overcast summer's day in a Ford Fiesta at around 40 MPH.

Yoshimi is a white-knuckle trip over a Swiss mountain pass in a blizzard, at night, facing donkeys, trucks and bandits, while driving an open-frame kit car doing 90 +

In recent times we've been able to dispose of the donkeys, and the bandits are on the run :)

This is not to be construed as a criticism of sampling and sample players. These are very helpful indeed, and quite essential, for delivery of many tones and tone colors, and this writer uses Fluidsynth for many purposes.

But there is tremendous power and flexibility inherent in live signal generation. With power comes the need for care ☺ Without it, one may find one's tires meeting gravel, and even brick walls, fairly quickly. Here are some items which can help.

avoid distro burden

Automatic USB flash drive mounting can eat USB MIDI and audio performance and cause xruns (JACK audio and/or ALSA MIDI hiccups). Other background services can do the same, and some of them are very hard to find. Excellent results will obtain when your distro makes it easy for you to use only the tools you need. More on this in Choosing a Linux Platform for Live Synth.

prevent and eliminate overdrive

Overdriving can be a real problem. Just like in the world of real wires, preamps, and amplifiers, Yoshimi and other software tools can easily and inadvertently be set up to produce a digital “signal” which will overdrive whatever it is connected to in its little software universe, producing anything from distortion, to xruns -- momentary overload and hesitation of the JACK audio-data-carrying infrastructure -- to broad crashes in the very extreme case.

reverb

The most common cause of overdrive, in my experience, has been reverb. In the digital realm, "reverberation" is not a single algorithmic modification of a single stream of data: it is a most carefully calculated addition and refolding-back of input signals, in deliberate emulation of real echoes. So more and more and more data piles on. If you need a really serious reverberation for your goal, you will have to dedicate a lot of your system power to it: if I wanted to go extreme with reverb I would probably find a way to dedicate a whole MultiJACK soft server to it and redesign/reroute accordingly. On the other hand, it might be better on the whole to just add a guitar-style stomp box outside of the synthesizer altogether; the guitar folks really do have their tools working well these days.
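To make the "refolding-back" concrete, here is a minimal sketch of a single feedback echo loop (a feedback comb filter, one classic building block of digital reverbs). It is not the algorithm Yoshimi or Calf actually use; the function name, the delay length and the feedback amount are all illustrative assumptions. It shows how a steady input at half of full scale can pile up past full scale once its own echoes are added back in.

```python
def feedback_comb(signal, delay, feedback):
    """One echo loop: y[n] = x[n] + feedback * y[n - delay].

    Illustrative only; real reverbs combine many such loops with
    filtering, and this is not taken from any particular plugin.
    """
    out = []
    for n, x in enumerate(signal):
        # Fold the already-produced output back in, `delay` samples later.
        echo = feedback * out[n - delay] if n >= delay else 0.0
        out.append(x + echo)
    return out


# A steady input at half of full scale (full scale being 1.0).
steady = [0.5] * 2000
wet = feedback_comb(steady, delay=100, feedback=0.7)

peak = max(abs(v) for v in wet)
print("input peak: 0.5, output peak: %.3f" % peak)
```

With a steady input the output climbs toward 0.5 / (1 - 0.7), roughly 1.67, which is well beyond full scale even though the input never went above 0.5. Turning down either the input level or the feedback amount brings it back under control.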

stream mixing

The other very common cause of overdrive I have had is overload due to the mixing of streams. The patch I use most, the Supermega Rumblic Organ, uses three independent Yoshimis each running two different patches, which produce three separate JACK output streams, all of which have to get crammed into the input of one triple stack of Calf filters, and thence towards the audio hardware. Overload occurred a lot. Eventually it occurred to me that the analogy of real copper might apply: if one takes the output of three physical keyboards being played at the same time and jams them into a triple Y adapter and thence into one amplifier, one will not be pleased with the result. So I tried several software mixers. I was not happy with their behavior, until I tried the marvelous Non-Mixer by Jonathan Moore Liles. Very nice, very easy to set up with exactly the inputs and outputs one needs, and all of them wire up with JACK. Smooth, beautiful, controllable!
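The triple Y adapter problem can be sketched in a few lines. This is only an illustration of the arithmetic, not how Non-Mixer works internally; the three sine tones, their 0.8 amplitude and the equal 1/3 gains are assumptions chosen to stand in for three synth outputs.

```python
import math


def sine(freq_hz, amplitude, n_samples, rate=48000):
    """A plain sine tone standing in for one synth's output stream."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / rate)
            for n in range(n_samples)]


def mix(streams, gains):
    """Sum the streams sample by sample, each scaled by its own gain."""
    return [sum(g * s[n] for g, s in zip(gains, streams))
            for n in range(len(streams[0]))]


def peak(samples):
    return max(abs(v) for v in samples)


# Three streams, each comfortably below full scale (1.0) on its own.
streams = [sine(f, 0.8, 4800) for f in (110.0, 220.0, 330.0)]

y_adapter = mix(streams, [1.0, 1.0, 1.0])   # everything jammed together
mixed = mix(streams, [1 / 3, 1 / 3, 1 / 3])  # a mixer with the faders pulled down

print("Y adapter peak: %.2f" % peak(y_adapter))
print("mixed peak:     %.2f" % peak(mixed))
```

Each stream alone peaks at 0.8, yet the straight sum goes past full scale, while the scaled mix cannot exceed 0.8. In practice one would set the gains by ear and by meter rather than splitting them equally.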

an audio filter called compressor

In addition to mixing, I also use an audio filter called a compressor, as the last item in my Calf chain. This helps keep things under yet more control.

When one applies an audio compressor, if the range of volume of a given signal runs too soft or too loud or both, the device will sense it, and change (compress) the amplitude range, to that which is appropriate for the particular PA, particular venue, particular audience, et cetera. It changes the range gently, in a controllable curve, so that it is not perceived as other than a deep part of the moment.
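As a rough sketch of what such a curve looks like, here is the simplest downward compressor: levels below a threshold pass unchanged, and anything above it is scaled back by a ratio. The threshold of -12 dB and the ratio of 4 are illustrative assumptions, not Calf's defaults, and a real compressor works on a smoothed envelope with attack and release times (and often a soft knee and make-up gain) rather than on each sample by itself.

```python
import math


def compress_db(level_db, threshold_db=-12.0, ratio=4.0):
    """Static compression curve, in decibels relative to full scale."""
    if level_db <= threshold_db:
        return level_db  # quiet enough: leave it alone
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio


def compress_sample(x, threshold_db=-12.0, ratio=4.0):
    """Apply the static curve to one sample (no attack/release smoothing)."""
    if x == 0.0:
        return 0.0
    level_db = 20.0 * math.log10(abs(x))
    out_db = compress_db(level_db, threshold_db, ratio)
    return math.copysign(10.0 ** (out_db / 20.0), x)


for level in (-30.0, -12.0, -6.0, 0.0):
    print("%6.1f dB in -> %6.1f dB out" % (level, compress_db(level)))
```

With these settings a full-scale input (0 dB) comes out at -9 dB, while anything at or below -12 dB is untouched, so only the loud end of the range is squeezed.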

For quite a while, also, I have been using the multiband Calf compressor. This is because I find that I sometimes get overdrive in only certain areas of the audio spectrum, so I keep just those areas under maximum control. The whole idea, after all, is to produce the most profound, edgy, soul-driving tonage we can, right? So clearly we don't want to "tone down" anything we don't have to tone down!

special to the digital realm

As is discussed in reverb above, the digital realm is different in many ways from the analogue. There are many more ways this plays out.

Most Linux audio tools are quite functional with lower-powered CPUs, and yet maximum potential is far out of sight even with big server-class hardware. For example, one instance of Yoshimi can be configured to (try to) deliver sixteen different CPU-intensive tones at once, from a single MIDI command. But on my AMD 8-core, I sometimes run three Yoshimis at once, because this distributes the load among all those CPUs. The result of careful design of this kind is the most profound tonality I have ever had available to my hands.

The multi-CPU tool htop, which has a powerful full-screen text UI, can be very helpful in figuring out what is going on. It is worthwhile to realize that sometimes your Yoshimis will not be taking all of your CPU, but you'll still get xruns and overload conditions; this is likely the overdrive mentioned above. A VU meter applet, placed in your JACK setup where it receives everything being sent to the outputs, can help verify this.
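htop already shows per-core meters, so nothing below is needed to follow this advice; it is only a sketch of why the per-core view matters. It parses two snapshots in the layout of Linux's /proc/stat and reports how busy each core was in between. The function names and the two-core sample snapshots are made up for illustration.

```python
def parse_proc_stat(text):
    """Return {core_name: (busy_ticks, total_ticks)} for each 'cpuN' line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines start with "cpu0", "cpu1", ...; bare "cpu" is the total.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # Columns: user nice system idle iowait irq softirq steal
        ticks = [int(v) for v in fields[1:9]]
        idle = ticks[3] + ticks[4]
        total = sum(ticks)
        cores[fields[0]] = (total - idle, total)
    return cores


def usage_between(before, after):
    """Fraction of time each core was busy between two snapshots."""
    usage = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before[core]
        elapsed = total1 - total0
        usage[core] = (busy1 - busy0) / elapsed if elapsed else 0.0
    return usage


# Made-up snapshots of a two-core machine, in /proc/stat layout.
SAMPLE_BEFORE = """cpu  200 0 100 1700 0 0 0 0 0 0
cpu0 100 0 50 850 0 0 0 0 0 0
cpu1 100 0 50 850 0 0 0 0 0 0
"""
SAMPLE_AFTER = """cpu  300 0 100 1800 0 0 0 0 0 0
cpu0 195 0 50 855 0 0 0 0 0 0
cpu1 105 0 50 945 0 0 0 0 0 0
"""

usage = usage_between(parse_proc_stat(SAMPLE_BEFORE), parse_proc_stat(SAMPLE_AFTER))
average = sum(usage.values()) / len(usage)
for core in sorted(usage):
    print("%s: %3.0f%% busy" % (core, usage[core] * 100))
print("average: %3.0f%%" % (average * 100))
```

In the sample, the machine as a whole looks half idle, yet one core is almost saturated. To try it on a live system, read /proc/stat twice a second or so apart and pass the two texts through the same functions.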

Here are a few methods of approach.

The first is multi-Yoshimi. The last time I tried a multi-ZASFX approach, the result was a total hardware lockup; not a CPU limitation, but something much worse. The fact that multi-Yoshimi works so well is one of its many joys. Use htop, and if your single Yoshimi is using most of one CPU core, try two or three separate instances.

The second is polyphony. In Yoshimi's main screen there is a setting for maximum simultaneous notes. Try reducing it, but not too much; there is a lot of wiggle room in other places.

If you use reverb within Yoshimi patches – and many have them within – you may find good reason to change things according to the readouts of htop. I have found it much more efficient to apply reverb and other filters using Calf, as separate JACK process elements, “after the fact” if you will. My own semistandard approach is to use zero reverb in my Yoshimis, and then pipe them all to a stack of Calf plugins, starting with EQ, then Reverb, then Multiband Compression.

when the numbers don't add up: MultiJACK

The current BNR (Box of No Return) does not route all audio data through one JACK process, which is the usual arrangement at this writing. This is because, despite quite a lot of work on the above items, I was seeing strange results: I was using 10-20% of my big 4GHz octocore CPU, and yet my one JACK process was showing 75% of its capacity used. A whole lot of the advice given at this point was to just give up, suggesting in effect that I could not get more out of the one box than I was getting. This struck me as ridiculous, although the advisers were primary sources! After about two and a half years, however, something called MultiJACK emerged, in which multiple JACK servers are connected by IP within the single box. This has eliminated the problem entirely.
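One possible way to see how two such numbers can coexist (offered only as illustrative arithmetic, not as a diagnosis of what was actually happening on this box) is that JACK's load figure is measured against the deadline of each audio period, not against the whole machine. The period size, sample rate and processing time below are hypothetical values, and the sketch assumes the whole processing chain runs one step after another on a single core.

```python
def period_seconds(frames_per_period, sample_rate):
    """Time available to process one JACK period before the next is due."""
    return frames_per_period / sample_rate


def dsp_load(process_seconds, frames_per_period, sample_rate):
    """Share of each period's deadline spent processing audio."""
    return process_seconds / period_seconds(frames_per_period, sample_rate)


def machine_average(process_seconds, frames_per_period, sample_rate, cores):
    """The same work averaged over all cores, ASSUMING it runs on just one."""
    return dsp_load(process_seconds, frames_per_period, sample_rate) / cores


# Hypothetical settings: 256 frames per period at 48 kHz, on 8 cores,
# with the whole chain taking 4 ms to process each period.
frames, rate, cores = 256, 48000, 8
chain_seconds = 0.004

print("deadline per period: %.2f ms" % (period_seconds(frames, rate) * 1000))
print("load against deadline: %.0f%%" % (dsp_load(chain_seconds, frames, rate) * 100))
print("averaged over %d cores: %.1f%%"
      % (cores, machine_average(chain_seconds, frames, rate, cores) * 100))
```

Under these assumptions, a chain that uses three quarters of every deadline shows up as less than a tenth of the machine overall. Splitting the work across several servers, each with its own deadline to meet, is one way such headroom could be recovered.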
