Tuesday, February 15, 2022

Gotcha: When Windows and WSL2 Stop Talking To Each Other

When Linux and Windows Can Talk

Let's say I'm in an Ubuntu WSL2 session and I want to copy a file to Google Drive. The latest version of Google Drive allows access via a mounted drive, in my case, G:\. The simplest way I've found to copy a file into G:\ is to make use of PowerShell:

$ date > data.txt
$ powershell.exe -C copy "c:/Users/benji/Downloads/data.txt" "'G:\My Drive\'"
$ powershell.exe -C dir "'G:\My Drive\data.txt'"


    Directory: G:\My Drive


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
------         2/14/2022   6:13 AM             29 data.txt


$ powershell.exe -C type "'G:\My Drive\data.txt'"
Mon Feb 14 06:13:56 EST 2022
$ cat data.txt
Mon Feb 14 06:13:56 EST 2022
$

While the commands above are clunky, they're straightforward. I'm kicking off an instance of PowerShell from Linux to run simple Windows commands. Because those commands are running under Windows, they can trivially access G:.

Or, consider another example. Let's say I want to open a web page from emacs running under Ubuntu. In emacs, I evaluate the following lisp code:

(browse-url "https://blogbyben.com")

The result is that the URL is opened in the Windows version of Chrome. I didn't have to configure this behavior; it Just Works. Until it doesn't.

When Linux and Windows Aren't On Speaking Terms

Every once in a while, I'll use the code above and I'll be greeted with a long delay and finally an error message. For example:

$ powershell.exe -C type "'G:\My Drive\data.txt'"
<3>init: (26039) ERROR: UtilAcceptVsock:244: accept4 failed 110

In emacs, running browse-url will occasionally just not work. Digging into the code, I realized that emacs is running xdg-open. Running this command manually, when the integration is broken, results in a similar error message.

$ time xdg-open https://www.blogbyben.com
<3>init: (26536) ERROR: UtilAcceptVsock:244: accept4 failed 110
<3>init: (26546) ERROR: UtilAcceptVsock:244: accept4 failed 110
<3>init: (26548) ERROR: UtilAcceptVsock:244: accept4 failed 110
<3>init: (26550) ERROR: UtilAcceptVsock:244: accept4 failed 110
/bin/xdg-open: 869: firefox: not found
/bin/xdg-open: 869: iceweasel: not found
/bin/xdg-open: 869: seamonkey: not found
/bin/xdg-open: 869: mozilla: not found
/bin/xdg-open: 869: epiphany: not found
/bin/xdg-open: 869: konqueror: not found
/bin/xdg-open: 869: chromium: not found
/bin/xdg-open: 869: chromium-browser: not found
/bin/xdg-open: 869: google-chrome: not found
<3>init: (26599) ERROR: UtilAcceptVsock:244: accept4 failed 110
<3>init: (26609) ERROR: UtilAcceptVsock:244: accept4 failed 110
<3>init: (26611) ERROR: UtilAcceptVsock:244: accept4 failed 110
<3>init: (26613) ERROR: UtilAcceptVsock:244: accept4 failed 110
/bin/xdg-open: 869: links2: not found
/bin/xdg-open: 869: elinks: not found
/bin/xdg-open: 869: links: not found
/bin/xdg-open: 869: lynx: not found
/bin/xdg-open: 869: w3m: not found
xdg-open: no method available for opening 'https://www.blogbyben.com'

real    1m20.400s
user    0m0.159s
sys     0m0.067s

What the Heck?

From poking around the web, I learned that the integration between Windows and WSL2 depends on the environment variable $WSL_INTEROP. This variable holds the path to a socket file, which provides the communication channel.
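A quick way to see what the current shell is pointing at is a small check like the one below. This is only a sketch: interop_status is a hypothetical helper name, not part of WSL, and a path that exists as a socket may still be stale, so "socket" here means "plausible", not "working".

```shell
# Hypothetical helper: classify the path stored in WSL_INTEROP.
# Prints one of: unset, socket, not-a-socket, missing.
interop_status() {
    if [ -z "$1" ]; then
        echo "unset"            # variable was empty or never set
    elif [ -S "$1" ]; then
        echo "socket"           # path exists and is a socket (may still be stale)
    elif [ -e "$1" ]; then
        echo "not-a-socket"     # path exists but is some other kind of file
    else
        echo "missing"          # path does not exist at all
    fi
}
```

Running interop_status "$WSL_INTEROP" in a misbehaving session at least rules out the variable being empty or pointing at a file that is gone.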

When I look in /var/run/WSL I see a number of files:

$ ls -1 /var/run/WSL/
10088_interop
11670_interop
13888_interop
1663_interop
17081_interop
1_interop
2189_interop
6106_interop
6268_interop
8855_interop
8_interop

The trick: WSL_INTEROP needs to be set to the right one for proper communication to happen. I confirmed this by going through the list of files, setting WSL_INTEROP to each one in turn and seeing if that fixed the problem. Eventually it did. But trial and error is hardly the way I want to get back on track once Linux and Windows stop talking.
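The trial-and-error pass described above can be scripted. The sketch below is an illustration, not a recommended fix: find_working_interop is a hypothetical name, and the probe command is whatever you pass in. It tries each candidate in a directory, runs the probe with WSL_INTEROP pointed at that candidate, and prints the first path for which the probe succeeds.

```shell
# Hypothetical helper: find_working_interop DIR PROBE [ARGS...]
# Runs PROBE once per DIR/*_interop file with WSL_INTEROP set to that file.
# Prints the first path where PROBE exits 0; returns 1 if none does.
find_working_interop() {
    fwi_dir="$1"
    shift
    for fwi_sock in "$fwi_dir"/*_interop; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$fwi_sock" ] || continue
        if WSL_INTEROP="$fwi_sock" "$@" >/dev/null 2>&1; then
            printf '%s\n' "$fwi_sock"
            return 0
        fi
    done
    return 1
}
```

One possible probe, assuming cmd.exe is on the PATH and coreutils timeout is installed, is: find_working_interop /var/run/WSL timeout 5 cmd.exe /c ver. The timeout matters because a broken socket can hang for a long while before failing.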

This discussion thread suggests using pstree to automatically set WSL_INTEROP. However, the recipe provided didn't work for me.

The goal of the command appears to be to set WSL_INTEROP to the value /var/run/WSL/<PID>_interop, where PID is the process ID of the init process that the current shell descends from.
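That idea can be sketched without pstree by walking up the process tree with ps. This is my reading of the goal, not the recipe from the thread: interop_from_ancestors is a hypothetical name, and the sketch assumes a ps that supports -o ppid= and -p (procps does).

```shell
# Hypothetical helper: interop_from_ancestors [START_PID] [DIR]
# Starting at START_PID, climb parent by parent and print the first
# DIR/<PID>_interop path that exists. Returns 1 if none is found.
interop_from_ancestors() {
    ifa_pid="${1:-$$}"
    ifa_dir="${2:-/var/run/WSL}"
    while [ -n "$ifa_pid" ] && [ "$ifa_pid" -gt 0 ] 2>/dev/null; do
        if [ -e "$ifa_dir/${ifa_pid}_interop" ]; then
            printf '%s\n' "$ifa_dir/${ifa_pid}_interop"
            return 0
        fi
        # "ppid=" suppresses the header; tr strips the column padding.
        ifa_pid=$(ps -o ppid= -p "$ifa_pid" 2>/dev/null | tr -d ' ')
    done
    return 1
}
```

If it prints a path, that path is a candidate value for WSL_INTEROP. If the shell was started in a way that detaches it from its original init, the walk may come back empty.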

Looking at my running system, I see that init is running multiple times, and that the PIDs correspond to files in /var/run/WSL.

$ ps auxwww|grep init
root         1  0.0  0.0   1772  1100 ?        Sl   Feb09   0:00 /init
root         8  0.0  0.0   1752    80 ?        S    Feb09   0:00 /init
root      8854  0.0  0.0   1772   100 ?        Ss   Feb10   0:00 /init
root      8855  0.0  0.0   1780   100 ?        S    Feb10   0:01 /init
ben      26051  0.0  0.0   8160  2488 pts/6    S+   06:18   0:00 grep init

After a few attempts, I realized one way to get the correct init PID is to use the last one in the process list. Using grep, tail and awk I can extract this value.

$ ps auxwww|grep init | grep -v grep | tail -1
root      8855  0.0  0.0   1780   100 ?        S    Feb10   0:02 /init
$ ps auxwww|grep init | grep -v grep | tail -1 | awk '{print $2}'
8855
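The same extraction can be done in a single awk pass, which avoids the grep -v grep step. This is a sketch that keeps the "last init wins" heuristic from above, which may not hold on every system. last_init_pid is a hypothetical name, and it expects input in the two-column form produced by ps -e -o pid= -o args=.

```shell
# Hypothetical helper: read "PID COMMAND..." lines on stdin and print
# the PID of the last line whose command is exactly /init.
last_init_pid() {
    awk '$2 == "/init" { pid = $1 } END { if (pid != "") print pid }'
}
```

Usage would look like: ps -e -o pid= -o args= | last_init_pid. Because the match is on the command field being exactly /init, a line such as "grep init" is never picked up.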

I then added the following to my .bashrc:

export WSL_INTEROP=/var/run/WSL/$(ps auxwww | grep init | \
   grep -v grep | tail -1 | awk '{print $2}')_interop

This code properly sets WSL_INTEROP and revives the Windows / Linux communication channel. With that in place, I'm back to opening up web pages and PowerShell'ing like a champ.
