
The Blog

This post is also in the form of a video blog. Watch the video first!

The description below is longer than the one on YouTube, since the full version exceeded YouTube's 5000-character limit. The version here has all the full URLs and any extra comments I might add.

In this video, I explain how audio and sound work on Linux-based computers and systems. More specifically, I go over the point of sound hardware, kernel drivers such as OSS and ALSA, and userspace sound servers such as PulseAudio, JACK and Pipewire.

Along the way, I discuss the advantages and drawbacks of the current implementations, as well as why one implementation is often favored over another. Finally, I discuss the latest-and-greatest sound server, Pipewire, what it means, and how you can benefit from the improvements.

This video is a bit rambly at times, so please stick with me, and I hope you learn something throughout and feed your curiosity. Please feel free to use the timestamps below to skip between sections!

(grouped by topics)

Introduction

  • 00:00 - Introduction

The Hardware

  • 00:18 - Basic Hardware, Inputs and Outputs
  • 00:36 - Sound Cards (and what they do)
  • 01:01 - Digital Audio, PCM and extra hardware

Kernel Drivers

  • 01:29 - Kernel Drivers! (How to interact with hardware)
  • 01:53 - OSS (Open Sound System)
  • 02:12 - ALSA (Advanced Linux Sound Architecture)
  • 02:46 - ALSA Limitations - hardware mixing/multiplexing

Userspace Sound Servers

  • 03:54 - Pulseaudio (and sound servers)
  • 04:25 - Benefits of PA - mixing and resampling
  • 07:26 - Drawbacks of PA (and JACK introduction)
  • 08:13 - JACK and its benefits
  • 09:57 - Comparison with PA and other software

Pipewire (and ramble)

  • 11:12 - Pipewire (and its benefits)
  • 14:05 - Future of Pipewire
  • 15:17 - Note on Bluetooth (rant)

- note: mostly fixed!

  • 17:52 - Conclusion

Sound Cards

Check ALSA compatibility of a sound card

DAC and ADC

Nyquist Shannon sampling theorem

Chris Montgomery Videos

Kernel Driver Architecture

OSS

ALSA

Sound card multiplexing

Pulseaudio

Jack

Pipewire

Firewire

  • 0040 - When I say sound card, most computers have one built in these days, e.g. onboard audio. Physical discrete cards are mostly a thing of the past.
  • 0250 - Sound card multiplexing is also often called hardware mixing.
  • 1240 - There is also a “Pro Audio” mode for sound cards that splits all the channels
  • 1705 - Most of these disconnection issues are now fixed as of the time of publishing!
  • I'll add more notes as I remember when rewatching this.
  • Please note that due to classes and school and co-op, the filming/editing/uploads of my videos are very delayed, and might not be the most timely. This video was filmed April 2021, edited June-July 2021, description written August 2021. I hate writing descriptions and thumbnails…

Watch this video on Peertube: https://peertube.tonytascioglu.com More info is probably on my wiki: https://wiki.tonytascioglu.com

Copyright 2021 - Tony Tascioglu I'm making this freely available under a CC-BY-SA-NC.

Email: tonytash@pm.me (not monitored 24/7) I might not get to comments on this video until the end of my next school/work term, feel free to post anyways.

I hope you enjoyed the video and learned something!

Shoutouts

Randy MacLeod (and the rest of the Wind River Linux userspace team). I know you had asked me about Pipewire at some point, and I already had this video in the works, so hopefully you find it useful :)

  • I'll update this as corrections are pointed out.
2021-08-19 17:33 · Tony

As someone who sometimes tinkers with radios, I sometimes just scroll through channels with a cheap r820t2 dongle and gqrx.

Spectrum

The latest thing I noticed were bars of what are presumably digital signals alongside some commercial FM radio stations.

(todo: add screenshot here)

I questioned whether it was interference, but ruled that out since it wouldn't be in such a clean pattern, and the very consistent bandwidth is very reminiscent of digital radio signals.

Digital Radio

At the time, the only digital radio standard I knew of was DAB, which is used in some parts of Europe.

Turns out, there's an (IMO crappier) standard for digital radio used in North America called HD Radio.

Proprietary

Unlike the rather open DAB standard, HD Radio is proprietary, which is enough for me to hate it already.

This means that if you even want to receive or broadcast HDr, you'll probably get hit with hefty licensing fees.

It's also going to (IMO) slow the rate of adoption, since if you're going to spend so much to upgrade a station, who is going to listen to it?

Needs new $$$ receivers

The reason AM/FM radio still works and is kind-of used in the 21st century is that it's ridiculously simple to receive (more so for AM than FM), and for cheap.

Digital radio already has a major hurdle there. If I have to spend $100 on a new radio, that's not going to happen (unless it's for ham radio).

Practically everyone has an AM/FM receiver in their house, probably in an old CD player, boombox or car. Why would someone spend that much money on a new radio?

At that point, you can just pay for Deezer/Spotify.

I mean, I get why it's more expensive. Just like with DAB, you need what is basically a low-power CPU, since you're decoding a compressed digital audio stream.

AAC vs MP2

Adding to the cost and proprietary nature is that HD Radio uses what basically amounts to an AAC audio stream.

Now, at least HE-AAC isn't terrible at low bitrates, and I think it can usually get closer to fullband audio.

DAB for comparison uses MP2. Some users may be familiar with MP2 as some DVDs used it instead of AC3 for the audio track.

As a pro for HD Radio vs DAB: MP2, like MP3, is a rather old codec. I find it's only usable at roughly 192 kbit/s and up, and it suffers far worse than AAC at low bitrates.

This means that theoretically at least, a 64 kbit/s AAC stream is better than MP2.

On the other hand, AAC is very very patent encumbered.

MP2 has the advantage that all of its corresponding patents have since expired, so anyone can use hardware/software for MP2 without nasty royalties.

AAC is expensive to implement, and I guess you just hope not to get sued if you're using it.

Opus really is ideal here - royalty free, which means you can get more widespread adoption, and it crushes the other codecs at low bitrates.

Speaking of sound at low bitrates:

Limited stream bandwidth

From what I've seen, it's also limited to ~96 kbit/s transmissions with up to 4 streams.

This means that stations need to allocate that limited digital bandwidth among all the content they want to stream.

I've seen some stations do a main stream at 64 kbit/s and a secondary at 32, and other stations that do 32/32/32. I've also seen 48/16/16.

That's not a lot of bits to go around.
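To make the bit budget concrete, here is a small sketch that checks the splits mentioned above against the total. The 96 kbit/s budget and the 4-stream cap are just the rough numbers from this post, not values taken from the standard, and the function name is made up for illustration.

```python
# Rough numbers from this post, not from the NRSC-5 spec.
BUDGET_KBPS = 96
MAX_STREAMS = 4


def check_allocation(streams_kbps):
    """Return (fits, leftover_kbps) for a list of per-stream bitrates."""
    total = sum(streams_kbps)
    fits = len(streams_kbps) <= MAX_STREAMS and total <= BUDGET_KBPS
    return fits, BUDGET_KBPS - total


if __name__ == "__main__":
    # The three splits I've seen on air.
    for split in ([64, 32], [32, 32, 32], [48, 16, 16]):
        print(split, check_allocation(split))
    # An even four-way split of the whole budget.
    print("even 4-way split:", BUDGET_KBPS // MAX_STREAMS, "kbit/s per stream")
```

Under these assumptions, an even four-way split leaves only 24 kbit/s per stream.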

I have yet to see a station use all four and I'm sure it would sound terrible if they did.

As mentioned above, AAC really starts to break down at 64 kbit/s and below.

HD? More like compression

When I first read the name “HD Radio” I assumed it was some way of doing lossless or high bandwidth audio signals.

“HD”. What a joke that was.

True HD radio might be neat, since while FM doesn't have (digital) compression artifacts [1], the volume levels are compressed with what is probably a brick-wall compressor [2].

[1] Except the few stations where I can clearly tell that they are just broadcasting off a s***y 64 kbit/s mp3 stream…

[2] i.e. put a multiband compressor in the chain, and crank the ratio dial to the right.

I mean, I can kinda see why. You need to get a high SNR so people further away can actually hear your stream, but don't want to pass the max modulation limit so the FCC doesn't come knocking.

Digital on the other hand is either all there or none, so you could get much higher sound quality.

Digital TV (ATSC) with basic MPEG-2 compression allowed us to send 720p60 and 1080i60 signals after all, which was a huge step up from analog NTSC*.

*That's also because video can generally be compressed digitally more than audio, but the point stands: no random noise in a DTV signal.

Anyways, back to my point. HD-Radio uses up to what I counted to be around 100 kbit/s of audio.

Even with one stream, that would sound bad with mp2 and mp3, usable with vorbis and aac, and pretty good with opus.

But of course, stations like having a substream so most of them use 64 kbit/s or even 32 kbit/s for their main stream.

As advanced as AAC is, it still sounds terrible at 32 kbit/s. Even at 64, it sounds noticeably worse than the FM counterpart.

I heard a talk news AM station on a sub-station at 16 kbit/s. I guess it's usable for voice, but it still sounds terrible.

For comparison, Opus is also fine at around 64 kbit/s. You can hear the compression, but it's bearable; it also starts to get worse around 32 for stereo music.

HD Radio sounds way worse overall than FM though. Bonus points for feeding the volume compressed-af signal to the digital stream too, even though it's unnecessary. Back to the loudness wars I guess.

Bandwidth usage

HD-Radio signals take up a lot of space.

I guess we're lucky here in NA that we don't have many stations in urban areas.

In Europe, most stations only get 100 kHz spacing, and at most 200 kHz.

When I was in Istanbul, they had over 100 stations in the city on all even numbers (100.0, 100.2, so on)

(they also had some questionable stations with questionable modulation on some off numbers interfering with everyone else…)

Meanwhile, in Toronto, forget about 200 kHz: from my room, I can only pick up ~15 stations in total. We have the space here to do a digital stream beside FM stations.

What I'm trying to say is that in Europe, allocating almost 100 kHz on the sides of the station will be impossible, since other stations are there.
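A back-of-the-envelope sketch of that spacing argument. Everything numeric here is an assumption for illustration: an 87.5-108 MHz band, an analog FM signal treated as ±100 kHz around the carrier, and digital sidebands taking the extra ~100 kHz on each side mentioned above. The neighbouring station is assumed to be analog-only.

```python
def channel_slots_khz(first_khz=87_600, last_khz=108_000, spacing_khz=200):
    """List carrier frequencies (in kHz) on a fixed raster, e.g. 100.0, 100.2, ..."""
    return list(range(first_khz, last_khz + 1, spacing_khz))


def sidebands_fit(spacing_khz, analog_half_khz=100, sideband_khz=100):
    """True if a hybrid station's digital sidebands clear an analog neighbour.

    The hybrid station reaches analog_half_khz + sideband_khz from its carrier;
    the neighbour reaches analog_half_khz back toward it.
    """
    return spacing_khz >= 2 * analog_half_khz + sideband_khz


if __name__ == "__main__":
    print("slots at 200 kHz spacing:", len(channel_slots_khz()))
    for spacing in (100, 200, 400, 800):
        print(spacing, "kHz spacing ->", "fits" if sidebands_fit(spacing) else "collides")
```

With these assumed widths, 100 and 200 kHz spacing both collide with the neighbour, while the wide gaps we get in a city with ~15 stations leave plenty of room.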

Compared to DAB, HDr takes up more bandwidth from what I can tell. You'd probably need to use a different part of the spectrum.

Note: You can also do what Norway did and I guess just kill off FM and switch to DAB. I don't recommend this approach.

See my notes above, namely the $$$ receivers part for why.

Why would I listen to it

If I wanted to hear a crappy 64 kbit/s AAC audio stream, I'd just press the listen online button on their website.

Surprise! Most people already have a computer/phone with an internet connection that can play those streams.

Not to mention, by the time I open a bad sounding HDR stream, I'd just play the playlist I WANT, WITHOUT ADS from Deezer/Spotify.

I only put up with ads on FM since it's free to listen to, and FM radios are basically free most of the time.

I ain't paying $100 to listen to ads at 64 kbit/s where I don't even get to pick the music, and there are only 4 stations.

Even if a digital radio was free, I think it would gain limited adoption in this day and age.

There is a program on GitHub that uses some reverse engineering to play the modified AAC streams.

You can find it here: https://github.com/theori-io/nrsc5

There's also a Python GUI should you prefer here: https://github.com/cmnybo/nrsc5-gui

It works on the cheap R820t2 receivers without many issues (except for the garbage antenna not receiving anything useful)

I guess the GUI can also show a traffic map and weather radar, if the stations bother broadcasting that. But again, anyone with a receiver fancy enough to show a traffic map probably has a freaking cellphone with Google Maps that also acts as navigation.

Maybe I'm acting like an old person here, but the simplicity (and cheapness) of AM/FM made it so prevalent. The fact that it was so prevalent was what made it good.

Need to reach a city in an emergency? Chances are, 70% of them have at least a crappy radio.

Nobody (that I know of) would buy an HD Radio receiver (or DAB receiver for that matter). By the time you start putting a CPU and everything, that's already a basic computer.

For that money, why listen to compressed radio with ads when you can just fire up a streaming service?

Thanks for reading. Leave any comments/questions down below.

2023-11-07 16:48

Many people are switching over to M.2 NVMe drives. NVMe has been supported in the Linux kernel for quite some time. The Sandisk WD SN550 is among the cheaper M.2 drives, without a DRAM cache but utilizing a custom SanDiskWD controller and NAND flash.

The problem is that earlier units (or maybe all) of these drives have a major problem with the Linux kernel. I also don't know if this is limited to the SN550, or if their other product lines are affected as well.

The SN550 will randomly time out (?) and you will be left unable to access anything on the system. Running any command in a terminal just returns “I/O Error”. You can't debug from logs, since there is no drive anymore to log the errors to. It seems like the entire file system unmounts itself as though you unplugged it.

Interestingly, this never happened using a PCIe 3.0 x4 to M.2 adapter in my desktop, but it happens very frequently on my ThinkPad E495 (not with the original SkHynix drive, but after I replaced it with the SN550 for extra capacity). It has happened on both Fedora and Arch Linux so far, and causes the whole system to lock up anywhere from a minute to a day after booting.

Thanks to the ArchLinux forums, I found a kernel parameter that has solved this problem for me.

You need to add the following to your kernel parameters at boot:

nvme_core.default_ps_max_latency_us=5500

If you use GRUB, add it to your

GRUB_CMDLINE_LINUX_DEFAULT="nvme_core.default_ps_max_latency_us=5500"

variable, separated by a space from any other parameters you have.

On Arch, this is found in

/etc/default/grub
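If you'd rather script that edit than do it by hand, here is a minimal Python sketch. The function name add_kernel_param is made up for illustration, and it assumes the variable is written on one line with double quotes. It only transforms text and never writes to /etc/default/grub itself, so you still review and save the result (and keep a backup).

```python
import re

PARAM = "nvme_core.default_ps_max_latency_us=5500"


def add_kernel_param(grub_text: str, param: str = PARAM) -> str:
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT.

    Existing parameters are kept and separated by single spaces.
    If the exact parameter is already present, the text is unchanged.
    """
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE
    )

    def _append(match: re.Match) -> str:
        existing = match.group(2).split()
        if param not in existing:
            existing.append(param)
        return match.group(1) + " ".join(existing) + match.group(3)

    return pattern.sub(_append, grub_text, count=1)


if __name__ == "__main__":
    sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    print(add_kernel_param(sample))
```

Note that this does not replace an existing nvme_core.default_ps_max_latency_us entry that has a different value; it only avoids adding the exact same parameter twice.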

Make sure you do

sudo grub-mkconfig -o /your-efi-path/grub.cfg

to generate the new GRUB configuration! The process is similar on other bootloaders such as rEFInd, just add it to your list of other kernel parameters. You can do that directly in /path-to-efi/refind.conf (or whatever you named it).

There are many instructions to add kernel parameters on the ArchWiki.
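After rebooting, it's worth checking that the parameter actually made it onto the kernel command line. A minimal sketch, assuming a Linux system where /proc/cmdline is readable; the helper name get_cmdline_param is hypothetical.

```python
from pathlib import Path

KEY = "nvme_core.default_ps_max_latency_us"


def get_cmdline_param(cmdline: str, key: str):
    """Return the value of key=value from a kernel command line, or None."""
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if sep and name == key:
            return value
    return None


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        value = get_cmdline_param(proc_cmdline.read_text(), KEY)
        print(f"{KEY} = {value}")
    else:
        print("/proc/cmdline not found (not a Linux system?)")
```

If this prints None for the value, the bootloader configuration probably was not regenerated or the wrong config file was edited.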

PS: The reason I call it Sandisk is that the NAND flash is still made by Sandisk, but as far as I know, WD acquired Sandisk a few years ago, and sells many SSDs under the WD brand name instead. Similar to how Toshiba's memory business is now rebranded as Kioxia.

Now I just need to fix the stupid AMDGPU Freeze error that locks up my system every other day…

2021-06-11 19:42 · Tony

Please note: This is one of my earlier posts from my blog (Jan 2019), and some stuff is missing from it, but it's here as a reference. I also need to break apart the long sections.

A common point of confusion on GNU/Linux systems is the display manager vs desktop environment, and what they both do, and how they all fit in with xorg or Wayland.

When the computer starts up, GNU/Linux only ships with a terminal (tty1-tty6). On top of the terminal, we run a graphics server. Historically, this has been Xorg X11, which came out some ~30 years ago. Newer systems occasionally use Wayland, but support is still lacking for many configurations (such as NVidia cards). Now, X11 just supports “Screens”, mice and keyboard inputs and outputs. Mind you, what X11 considers a “screen” is not just a monitor. I’ll go more into this in another post. Most importantly, X11 lets us have graphics, and not just text in a terminal.

Once X11 starts, all we have is an empty screen with a (blank) cursor. We need what we call a Display Manager to actually manage the display. The display manager (DM) starts with your computer (through systemd) and is what pops up the “login” screen, handles which Desktop Environment to load, session management and on some systems, locking the system.

The desktop environment itself is what we see most frequently. It handles all the applications that are running, maximizing, minimizing, stacking windows and the works. It handles the ‘menu’ or task bar, launching applications and essentially is what we think of as the ‘gui’ of the system.

X, commonly known as Xorg, is the display server used in most GNU/Linux systems. It’s rather similar to VNC. It makes a local “desktop” and connects to it. X11 is called such as it is the 11th major revision of the X system.

X has support for “screens”, “graphics cards”, mice and keyboards. An X ‘screen’ is not simply a monitor, as if you have 2 or 3 ‘screens’, you can’t actually move stuff around! They are independent sessions. (I’ll go into some more detail in another post). To get around this limitation, we just make a single screen, with a resolution that spans all of your monitors. (Currently, this is done through xrandr in most computers, previously, Xinerama was pretty common).
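As a rough illustration of the "single screen that spans all monitors" idea: the size of that one big screen is just the bounding box of every monitor's position and resolution. The Monitor type and function name below are made up for this sketch, and it assumes the layout starts at (0, 0) in the top-left corner; it does not talk to xrandr or X at all.

```python
from typing import NamedTuple


class Monitor(NamedTuple):
    x: int       # left edge within the big screen, in pixels
    y: int       # top edge within the big screen, in pixels
    width: int
    height: int


def spanning_screen_size(monitors):
    """Return (width, height) of the single screen that covers all monitors."""
    if not monitors:
        raise ValueError("need at least one monitor")
    right = max(m.x + m.width for m in monitors)
    bottom = max(m.y + m.height for m in monitors)
    return right, bottom


if __name__ == "__main__":
    # Two 1080p monitors side by side.
    layout = [Monitor(0, 0, 1920, 1080), Monitor(1920, 0, 1920, 1080)]
    print(spanning_screen_size(layout))
```

Windows can then be dragged anywhere inside that one coordinate space, which is what separate X ‘screens’ don't allow.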

If you simply start Xorg by itself, you’ll just get an empty screen (checker pattern or black) with a cross as a cursor. This is because there isn’t really anything running on that server. You can actually use the computer this way, manually launching programs from the terminal, but it’s not very common, as you can’t resize or maximize programs. (One common use is digital signage, where you don’t want any excess bloat – just the one application that needs to run in fullscreen.) Instead, we use a layer of abstraction, the desktop environment.

Technically, you don’t need a display manager to run a system. You can directly launch your desktop environment and that would work, but it would essentially “automatically login” and it would be a pain to have 2 users logged on and such (and lock screen in some DEs). This is where the display manager comes in. The DM is the only program to get launched with Xorg. It’s what prompts you to ‘log in’ to a system.

Display managers handle multiple sessions, such as having 2 different desktop environments running (eg: GNOME on tty1, KDE on tty2 and i3 on tty3) together, or, multiple users or instances running at the same time. They also take care of launching and exiting the desktop environments.
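To show roughly how a display manager knows which desktop environments it can launch: on many distributions, each installed session drops a .desktop file into /usr/share/xsessions or /usr/share/wayland-sessions, and the DM reads the Name and Exec keys from it. Those directories are an assumption about a typical setup rather than something this post covers, and the function names are made up for this sketch.

```python
import configparser
from pathlib import Path

# Assumed locations; typical on many distributions, but not guaranteed.
SESSION_DIRS = [Path("/usr/share/xsessions"), Path("/usr/share/wayland-sessions")]


def parse_session(desktop_text: str):
    """Return (name, exec_command) from the text of a session .desktop file."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    parser.optionxform = str  # keep key case, as .desktop keys are case-sensitive
    parser.read_string(desktop_text)
    entry = parser["Desktop Entry"]
    return entry.get("Name"), entry.get("Exec")


def list_sessions(dirs=SESSION_DIRS):
    """Collect (name, exec_command) for every readable session file."""
    sessions = []
    for directory in dirs:
        if not directory.is_dir():
            continue
        for path in sorted(directory.glob("*.desktop")):
            try:
                sessions.append(parse_session(path.read_text(encoding="utf-8")))
            except (configparser.Error, KeyError, OSError, UnicodeDecodeError):
                continue  # skip files this simple parser can't handle
    return sessions


if __name__ == "__main__":
    for name, command in list_sessions():
        print(f"{name}: {command}")
```

The entries this prints are roughly what shows up in the session picker on the login screen.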

Display managers have little functionality to them most of the time. They just manage sessions through logins and logouts. This is usually handled through PAM (pluggable authentication modules). In short, whenever someone needs to log in (or call sudo, or change settings), the program simply calls PAM, which decides whether to ask for a password, a fingerprint and such.

Common display managers include:

  • GDM (GNOME Display Manager) – Default on GNOME
  • SDDM (Simple Desktop Display Manager) – Default on KDE Plasma
  • LXDM (Lightweight X11 Display Manager) – Default on LXDE

Actually, you can use any combination of display manager and desktop environment. Personally, I currently use GDM with KDE on my laptop, since GDM handles PAM authentication for the fingerprint reader well, with KDE as my DE of choice. On my desktop, I use LXDM with KDE (or XFCE) due to the lightweight load.

You should also note that on most desktop environments, locking is handled through the DE, not the DM. On KDE, if I do ctrl+alt+L (lock hotkey), you’re prompted with a login wizard that looks like sddm. (Note that PAM is handled a bit differently: it uses system auth in this case instead of the login auth)

The desktop environment is the most (easily) visible component of the GUI. It handles the taskbar, menus, maximizing and minimizing applications, stacking applications (things on top of each other), desktop icons & wallpaper, launching most programs, WiFi connections, volumes, clipboards and such. It handles visual aspects as well, including icons, system colors and themes used in applications and other GTK/Qt settings.

A lot of essential parts of the system ship with a desktop environment. Every (full) desktop environment ships with its own suite of applications for file browsing, settings, web browsing. As such, the same computer can look and behave vastly differently under various desktop environments. This is also due to desktop environments being based on either the GTK+ or Qt toolkits.

For example, some common desktop environments:

  • GNOME (GTK3)
  • KDE (Qt5)
  • XFCE (GTK2/3)

Desktop environments connect with a lot of other system services to perform functions, in a way that is easy for the user to interact with. For example, the volume sliders on most DE’s actually just send commands to PulseAudio, the network manager is usually handled by NetworkManager, and so on.

Essentially, most of what you interact with (outside of applications) is dependent on the DE you choose. Everything from settings, to file managers, to lockscreens, to menus, to icons are determined by the DE. SO MUCH of the look, feel, and functionality can change based on the DE. There is so much to cover that I will probably end up creating another post just on this.

Now even DE’s have a core, which actually handles the applications. There are 2 main types: stacking and tiling. Stacking is what most users are used to. It’s used on Windows, macOS and most environments. It’s when open applications ‘stack’ on top of each other. If I open Firefox, then Thunderbird, Thunderbird will appear ‘on top’ of Firefox, and the keyboard will interact with Thunderbird. If you click on Firefox, the focus switches, and FF becomes ‘on top’. Meanwhile, in tiling desktops (like i3), opening 2 apps opens them side by side, usually following a tree layout.
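A toy model of the two approaches, just to make the difference concrete. This is not how any real window manager is implemented: the class and function names are made up, stacking is reduced to a plain list, and tiling is a flat side-by-side split rather than the tree layout something like i3 actually uses.

```python
class StackingDesktop:
    """Windows live in a list ordered bottom to top; the top one has focus."""

    def __init__(self):
        self.stack = []

    def open(self, app):
        self.stack.append(app)  # new windows appear on top

    def focus(self, app):
        self.stack.remove(app)  # clicking a window raises it to the top
        self.stack.append(app)

    @property
    def focused(self):
        return self.stack[-1] if self.stack else None


def tile_side_by_side(screen_width, apps):
    """Split the screen into equal columns; returns (app, x, width) tuples."""
    if not apps:
        return []
    base = screen_width // len(apps)
    tiles = []
    for index, app in enumerate(apps):
        x = index * base
        # The last column absorbs any leftover pixels.
        width = base if index < len(apps) - 1 else screen_width - x
        tiles.append((app, x, width))
    return tiles


if __name__ == "__main__":
    desktop = StackingDesktop()
    desktop.open("Firefox")
    desktop.open("Thunderbird")
    print("focused:", desktop.focused)
    desktop.focus("Firefox")
    print("stack after click:", desktop.stack)
    print(tile_side_by_side(1920, ["Firefox", "Thunderbird"]))
```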

Common desktop environments include:

  • GNOME
  • KDE (now called Plasma)
  • LXDE (now LXQt is preferred)
  • XFCE
  • i3 (or Sway for Wayland)
  • MATE (like GNOME 2)
  • Cinnamon
2021-05-01 01:41 · Tony

Please note: This is initially from my Blog, published December 2018! Some stuff has changed, but I'll keep this here as reference.

For the past several years, my system OS of choice has been GNU/Linux. I’ve gone from Ubuntu, to Debian, to Fedora, to OpenSUSE and finally, Arch. Through these years, I’ve also gone through many desktop environments, including Unity, GNOME, KDE, Xfce, i3, then back to KDE. This post starts to outline my experiences with each.

Back in 2014, I built a desktop to accompany my Laptop and for the experience. Initially, I purchased a Windows license and installed Windows 8.1 through the installation DVD.

Windows was very predictable, I had used Windows 8 on my laptop for several years, and despite preferring Windows 7, it was relatively easy getting adjusted. I kept this setup until Fall 2015, when I decided to try out Linux on the desktop.

At that time, I started looking into various distributions, eventually settling on Ubuntu (14.04 at the time for LTS) or Linux Mint (17.3 if I remember correctly), as they were easy distros to get into for beginners with lots of help on forums such as AskUbuntu. I settled on Ubuntu as I didn’t want to use a derivative of a derivative system (Mint is based on Ubuntu which is based on Debian). I also noted that Ubuntu had good driver support. This probably wouldn’t have been a big issue at the time anyways, since it was running an Intel i3-4130 with the integrated graphics.

At that time, I didn’t want anything to do with configuring Xorg (or anything for that matter), or compiling software. I just wanted something that worked, and was easy to switch to from Windows. That’s one of the areas where Ubuntu still shines. It is a very “mainstream” Linux distribution, providing easy usage for the masses. Driver support is good, the installer works fine, and the computer just works. It is a good starting point, as you get experience with installing programs from the terminal and such. Mind you, you still have to be interested in tech with the ability to pick things up (like a terminal), so while I realize it’s not for everyone, Ubuntu definitely holds your hand.

Actually, at first, I was very impressed by the ease of software installation in Ubuntu. No need to go to the website, download some .exe installer, press ‘no’ a bunch of times to avoid crapware and such. It worked, and installed all dependencies with few hiccups. I was equally impressed with the ease of updating software. Unlike Windows where one had to go to the manufacturer website and re-download and install the latest updates, all the programs could be upgraded all at once through 2 commands – a big improvement.

What wasn’t too nice was when you started to run into issues with apt. Either the versions were incompatible or you had to find .deb files to install with dpkg; I personally found that apt on Ubuntu broke a fair bit. In comparison, dnf on Fedora and pacman on Arch had hardly any software issues, not to mention no need to backport newer versions, which led to other conflicts. While I also didn’t find Unity appealing at all, I kept with it because it worked!

One of the biggest parts of the transition was now finding compatible software to run on both my Windows laptop that I used everyday for note taking at school, and the desktop which I used at home. Up until this point, I had simply been using Microsoft OneNote to take notes, which worked very well. In fact, I still haven’t found an equal alternative, though my current setup is a solid good enough IMO. The problem with OneNote was that it would not run properly on Ubuntu. I tried running all the versions of Office that I could with my student license, and had some luck installing Office 2010 (earliest the student edition worked for), but with limited support – such as no OneDrive sync which was crucial to my workflow back then. Let’s not even mention the broken fonts that came with Wine, and the issues of syncing notebook files that may be open on one computer with the other. (See my next post for my updated fix). I ended up continuing to use OneNote for the few months in that fall, simply resorting to using OneNote online when I had to use my desktop.

I took care of file syncing with Dropbox initially, then followed by Mega.nz, as they had native clients. While I don’t use either anymore (foreshadowing: I went to OwnCloud → NextCloud → Syncthing → rsync) they were easy to use and worked. It was part of easing into the new workflow. For reference, I had hardly used SSH up until that point.

Overall, I was pleased with the way the system was working. Stuff worked most of the time, and in the face of trouble, there was tons of forum support along the way, which definitely helped.

In conclusion, vanilla Ubuntu was the first stepping stone on my rabbit-hole journey. There is probably a LOT that I left out from this post, and I may just post them later on. Check back for the next part for my continuation on my note-taking and file management, as well as transitioning into the KDE desktop environment on Ubuntu.

2021-05-01 01:39 · Tony
  • blog.txt
  • Last modified: 2023-12-20 07:34
  • by Tony