Saturday, October 8, 2022

Running a Vaadin Tomcat cluster behind a reverse proxy

Vaadin is a full-stack Java framework for writing web applications.

One of its interesting features is that you can write the UI completely in Java, so you don't have to juggle different technologies and languages.

On the other hand, it's easy to include web components, or write your own, and connect them to your Java application.

Vaadin also has a powerful integrated push system, which lets you push UI updates and notifications to the client as soon as they are ready. It is based on Atmosphere and can use WebSockets, but falls back to standard HTTP long polling if nothing else works.

If you run a single Tomcat / servlet container instance, you either expose it directly to your customers or put a proxy in front of it. This is pretty standard, and you will find plenty of examples on how to do it with nginx or Apache. You can use a plain HTTP + WebSocket proxy or, in the case of Apache, the AJP connector to the backend service.

But if you run multiple backend Tomcat / servlet containers for redundancy, scaling and so on, things get more complicated.

A Vaadin application is usually stateful, so you can't just route each new request to any of your backend servers. All requests of a session have to go to the same servlet engine.

This is called "sticky session", since a session is sticking on the same backend server.

When you use nginx as the frontend / load balancer, you get sticky sessions too, but only based on the IP address of the client. https://docs.nginx.com/nginx/admin-guide/load-balancer/http-load-balancer/

This works fine as long as your users are spread over the world (or, more precisely, use many different IP addresses); the load is then distributed among the backend servers.

But if you have a company application that is used, for example, at two locations with 50 users each, all requests may end up on exactly one of the backend servers while the others sit idle.

This is because nginx hashes the public IP address of the client, so all users behind the same IP address are routed to the same backend, which isn't what we want.
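To see why this concentrates the load, here is a minimal Python sketch of IP-based hashing. It is not nginx's actual hash function; the pick_backend helper, the backend names and the two office addresses (taken from the documentation IP ranges) are assumptions made up for illustration.

```python
import hashlib

def pick_backend(client_ip: str, backends: list) -> str:
    """Pick a backend from a stable hash of the client IP (illustrative only)."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]

# Hypothetical setup: four backends, 100 users, but only two public IPs
# because each office sits behind a single gateway.
backends = ["backend1", "backend2", "backend3", "backend4"]
office_ips = ["203.0.113.10", "198.51.100.20"]
users = [office_ips[i % 2] for i in range(100)]

# Every user behind the same IP lands on the same backend.
used = {pick_backend(ip, backends) for ip in users}
print(f"{len(users)} users reach only {len(used)} of {len(backends)} backends")
```

However many users sit behind the two office addresses, at most two of the four backends ever receive traffic.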

Nginx can also route requests based on a cookie you define (for servlet engines this is called JSESSIONID). But that feature is not available in the free nginx version; it is a premium feature that requires a pricey subscription. https://docs.nginx.com/nginx/admin-guide/load-balancer/http-load-balancer/#enabling-session-persistence

So if you want to keep the load balancer free of charge, you have to choose something else.

There are dedicated HA proxies for this, but we chose the Apache web server.

The basic configuration of an Apache reverse proxy is simple. It gets a bit more complex with load balancing, and load balancing with WebSockets is poorly documented.

This is why this post exists: I searched (most of?) the internet for information on how to achieve this, but only found fragments of the solution, often with wrong parts in them.

Our solution correctly proxies HTTP, HTTP/2 and WebSocket requests to the backend servers in a sticky way. The SSL/HTTPS configuration is not part of this post, but you can set it up the standard way.

This setup is valid for Vaadin Flow 23.0, 23.1 and 23.2. For Vaadin 23.3 and later, the push endpoints have been simplified to make load balancer configuration easier. https://github.com/vaadin/flow/issues/14641

Here is what our config looks like; the individual parts are explained below:

# 1

ProxyRequests Off

#2
ProxyPass /images/ !
ProxyPass /.well-known/ !

#3

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/ [NC]
RewriteCond %{QUERY_STRING} transport=websocket [NC]
RewriteRule /(.*)       balancer://backend-ws/$1 [P,L]
 

#4
ProxyPass / balancer://backend/
ProxyPassReverse / balancer://backend/

#5

<Proxy balancer://backend>
    BalancerMember http://192.168.1.50:8080 route=backend1
    BalancerMember http://192.168.1.51:8080 route=backend2
    ProxySet stickysession=JSESSIONID
</Proxy>
 

#6
<Proxy balancer://backend-ws>
    BalancerMember ws://192.168.1.50:8080 route=backend1
    BalancerMember ws://192.168.1.51:8080 route=backend2
    ProxySet stickysession=JSESSIONID
</Proxy>

And in the Tomcat server.xml on each backend server (with the jvmRoute matching that server's route in the balancer):

<Engine name="Catalina" defaultHost="backend.service.ch" jvmRoute="backend1">

...

</Engine>

So let's explain the parts:

1. General proxy config

ProxyRequests Off

Make sure to have this in your config; otherwise your web server can be misused to proxy any request to the internet, turning it into an open proxy.

#2
ProxyPass /images/ !
ProxyPass /.well-known/ !

With these you can serve static content directly from your Apache web server (as long as it has access to that content).

The .well-known path is usually needed when you use Let's Encrypt certificates for HTTPS.

#3

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/ [NC]
RewriteCond %{QUERY_STRING} transport=websocket [NC]
RewriteRule /(.*)       balancer://backend-ws/$1 [P,L]

The rules above make sure that HTTP -> WebSocket upgrade requests are handled correctly and sent to the WebSocket balancer.

Depending on your application / backend you will need to tune the rewrites, but these work for a Vaadin application.

A rewrite rule is known to be suboptimal from a performance point of view, but so far I know of no other solution, until Vaadin 24 hopefully uses a dedicated push / WebSocket endpoint.
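The decision made by the two RewriteCond lines can be sketched in Python as follows. This only illustrates the matching logic (an unanchored, case-insensitive match on the query string, as requested by the [NC] flag), not how Apache implements it. The choose_balancer function and the example request URLs are hypothetical.

```python
import re
from urllib.parse import urlsplit

def choose_balancer(url: str) -> str:
    """Return the balancer a request would be sent to (illustrative only)."""
    query = urlsplit(url).query
    # RewriteCond %{QUERY_STRING} transport=websocket [NC]
    if re.search(r"transport=websocket", query, re.IGNORECASE):
        return "balancer://backend-ws"
    # Everything else falls through to the plain ProxyPass rule.
    return "balancer://backend"

# Hypothetical request URLs, made up for this example.
print(choose_balancer("/push?v-r=push&X-Atmosphere-Transport=websocket"))
print(choose_balancer("/push?v-r=push&X-Atmosphere-Transport=long-polling"))
print(choose_balancer("/home"))
```

Because the match is case-insensitive and unanchored, any query parameter whose name ends in "transport" and whose value is "websocket" triggers the WebSocket balancer.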

#4
ProxyPass / balancer://backend/
ProxyPassReverse / balancer://backend/

Here we route the normal HTTP and HTTP/2 requests to the HTTP balancer. Take care to include the trailing / after the balancer name; otherwise you will get strange errors like "No protocol handler was valid for the URL /home (scheme 'balancer')" in your server error log.

#5

<Proxy balancer://backend>
    BalancerMember http://192.168.1.50:8080 route=backend1
    BalancerMember http://192.168.1.51:8080 route=backend2
    ProxySet stickysession=JSESSIONID
</Proxy>

This is the load balancer that routes the requests to the two backend servers; you can add more members if you have more backends.

The stickysession indicates to use the JSESSIONID cookie to match the requests to the correct backend. The name of the route should match your jvmRoute entry in the server.xml file for each backend.
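As a rough sketch of what happens here: with jvmRoute set, Tomcat appends the route to the session ID after a dot (for example ABC123.backend1), and the balancer uses the part behind the dot to find the member with the same route. The Python below only illustrates that lookup; the function names and the sample session IDs are hypothetical, and the real logic lives inside mod_proxy_balancer.

```python
from typing import Optional

# route -> backend URL, mirroring the BalancerMember lines above
MEMBERS = {
    "backend1": "http://192.168.1.50:8080",
    "backend2": "http://192.168.1.51:8080",
}

def route_from_session_id(jsessionid: str) -> Optional[str]:
    """Return the part behind the first dot, or None if there is no route."""
    _, dot, route = jsessionid.partition(".")
    return route if dot and route else None

def sticky_member(jsessionid: Optional[str]) -> Optional[str]:
    """Return the backend URL a session sticks to, or None if it is not bound yet."""
    if not jsessionid:
        return None
    route = route_from_session_id(jsessionid)
    # Unknown or missing routes are not sticky; the balancer then picks a member itself.
    return MEMBERS.get(route) if route else None

# Sample (made-up) session IDs
print(sticky_member("0123456789ABCDEF.backend2"))
print(sticky_member("0123456789ABCDEF"))
```

If the route in the cookie does not match any BalancerMember route (for example because jvmRoute is missing or misspelled in server.xml), no member is found and the session is not sticky.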

#6
<Proxy balancer://backend-ws>
    BalancerMember ws://192.168.1.50:8080 route=backend1
    BalancerMember ws://192.168.1.51:8080 route=backend2
    ProxySet stickysession=JSESSIONID
</Proxy>

Same as #5, but for the websocket requests.

As you can see in the balancer definitions, the backend servers are connected via unencrypted http/ws. If you need https/wss towards the backend servers too, just replace the schemes in the backend server definitions with https/wss. Of course you will then also have to handle the certificates on the Tomcat side.

Required Apache modules for this to work:

Apache should use the event MPM if possible, for better handling of HTTP/2 and WebSockets. Other MPMs might work; I have not tested them.

As for the modules themselves, enable these:

http2 -> For http/2 of course

proxy -> General basic proxy functionality

proxy_balancer -> To use balancers towards the backend

proxy_http -> For http 1.x proxy requests

proxy_http2 -> For http/2 proxy requests

proxy_wstunnel -> For websocket proxy stuff

rewrite -> To identify and redirect websocket requests to the ws balancer

lbmethod_byrequests -> type of loadbalancing to use

On Debian you can just use a2enmod <module_name> to enable them. On other distributions the commands vary, but in the end these modules must be active to get a full HTTP / HTTP/2 / WebSocket load balancer for Vaadin.

This setup works for Vaadin 23.1.x. For Vaadin 24 there is an ongoing discussion about a dedicated endpoint for push / WebSockets. https://github.com/vaadin/flow/issues/14641#issuecomment-1266519119

There is also documentation (still to be written) about Vaadin and reverse proxy setups:

https://github.com/vaadin/docs/issues/1776#issuecomment-1272384234


Wednesday, May 18, 2022

Cleanup huge WindowsApps folder

There is an updated post here, with fewer details, but including the script.

In some cases the "c:\Program Files\WindowsApps" folder starts to fill up the hard disk.

We had one customer with a 256 GB SSD where the WindowsApps folder took up 180 GB.

So what's going on, and how do you clean this up?

It's not clear why there are so many orphaned versions / installations in the WindowsApps folder; there must be a bug somewhere in Windows 10 that causes this.

So the first question is not really answered. What about the cleanup?

You can google for this and find a lot of results, but not many with a real solution.

The best post I found is this one: https://www.tenforums.com/performance-maintenance/185009-how-clean-up-windowsapps-folder-3.html with a script that detects all orphans, on page 3 of the discussion.

It works like a charm. The only thing missing is actually deleting the folders / files from the running standard system rather than from the recovery console.

The reason you need the recovery console is the NTFS owner and permissions of that special folder.

But of course there is a way around this too:

1. Take ownership of the folder (and its content) with takeown

2. Set the ACLs with icacls so that the current user is allowed to delete the folder and its content

3. Finally, with 182 orphaned folders, you don't want to confirm each deletion manually, so we add the /Q argument to the RD command.

A small note: the takeown command expects /d y as the confirmation on English systems and /d j on German systems. So modify that line to match your confirmation letter.
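The three steps, including the locale-dependent confirmation letter, can be sketched as a small Python generator. This is only an illustration of how the command lines are put together; the cleanup_commands function, the example folder name and the user name are hypothetical, and the sketch prints the commands without running anything.

```python
def cleanup_commands(folder: str, user: str, confirm: str = "y") -> list:
    """Build the three command lines for one orphaned folder.

    confirm is the confirmation letter takeown expects: "y" on English
    systems, "j" on German ones.
    """
    return [
        f'takeown /F "{folder}" /r /d {confirm}',  # 1. take ownership recursively
        f'icacls "{folder}" /t /grant {user}:F',   # 2. grant full control
        f'RD /Q /S "{folder}"',                    # 3. delete without prompting
    ]

# Made-up folder and user, German confirmation letter
example = cleanup_commands(
    r"C:\Program Files\WindowsApps\Example.App_1.0.0.0_x64__abc",
    "admin",
    confirm="j",
)
for line in example:
    print(line)
```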

The lines from line number 120 onwards then look like this:

echo rem folder "%%z" (about !oldlineSize! Bytes^, about !SizeMB! MBytes^)
echo takeown /F "%WA%\%%z" /r /d y
echo icacls "%WA%\%%z" /t /grant %USERNAME%:F

echo RD /Q /S "%WA%\%%z"

So you can now run the script as admin, rename the resulting file to delorphans.cmd and run that one as admin too, and your WindowsApps folder is clean once more.

To have the complete script available in one place, here is my enhanced version:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: By Einstein1969 for www.tenforum.com
::
%= CleanWA =% @set "Version=0.1.4 BETA"
::
:: Detect orphaned dirs in %ProgramFiles%\WindowsApps, check for integrity/consistency
::
:: Requirements: Save to UTF-8. Use Lucida Console Font in CMD window. Run with double click over icon script.
::         Windows 10 version 1511 (build number 10586) onward
::
:: history:
::   04/02/2022 0.1.4 BETA      fix some minor bug and aesthetic improvements
::     /10/2021 0.1.4 BETA    fix bug for elevated char ;,=() &^
::   29/09/2021 0.1.3 BETA     Search for applications that do not have InstallLocation set.
::                Enable support for Unicode UTF-8 and VT-ANSI
::   25/09/2021    0.1.2 BETA     removed debug, added run as administrator , thanks Matthew Wai
::   25/09/2021    0.1.1 BETA     add debug instruction for "file not found" bug/error.
::
:: ref: https://www.tenforums.com/tutorials/4689-uninstall-apps-windows-10-a.html
:: ref: https://docs.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences
::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
@echo off & setlocal EnableDelayedExpansion&color 1f&title CleanWA %Version%

Rem Choice the WindowsApps (WA) directory to check.
set "WA=%ProgramW6432%\WindowsApps"


Rem Check for Run as administrator.
call :Restart_as_Administrator %1

Call :video_setting

Rem check for permission on WA
if not exist "%WA%\." echo( & echo Error: Problem accessing "%WA%" & goto :the_end

echo(
echo   • Analyze "%WA%"
echo(

Echo     • Check for applications that do not have "InstallLocation" set.
echo(

rem create list of registered app
> "%tmp%\CleanWA.RegisteredApps.tmp.txt"  ^
 (
    rem exclude "system" applications in c:\windows folder
    powershell -Command "Get-AppxPackage -AllUsers | ? { $_.SignatureKind -ne 'System' } | sort -property {$_.InstallLocation+' '} | ForEach-Object {'{0,-40}  {1,-20}  {2}' -f $_.name,$_.version,$_.InstallLocation}"
 )

type nul: > "%tmp%\CleanWA.RegisteredApps_good.tmp.txt"
type nul: > "%tmp%\CleanWA.RegisteredApps_NO_good.tmp.txt"
FOR /f "usebackq tokens=1,2,*" %%a in ("%tmp%\CleanWA.RegisteredApps.tmp.txt") do (
    if "%%c" equ "" (
        echo(    %CSI%7m? Warning: Registered app "%%a" does not have a "InstallLocation" set/defined. ?%CSI%27m
        echo(
        echo Rem Warning: Registered app "%%a" does not have a "InstallLocation" set/defined. >> "%tmp%\CleanWA.RegisteredApps_NO_good.tmp.txt"
    ) else echo(%%a %%b %%c >> "%tmp%\CleanWA.RegisteredApps_good.tmp.txt"
)

echo     v check finished.
echo(

echo     • Detect orphaned dirs
echo(

type nul > "%tmp%\CleanWA.report_orphans.tmp.txt"
set "dn=0" & rem number of folders/directories
set /A "Size=0, Totsize=0, TotsizeMB=0, dirsOrphans=0"
FOR /f "tokens=*" %%z IN ('dir /b /o:n "%WA%"') DO (
  set /a "dn+=1"
 
  FOR /f "tokens=1-4" %%g in ('dir /S /W /-C "%WA%\%%z"') do (set "oldlineSize=!line!" & set line=%%i)
  call :pad dn 4
  set /A "size=!oldlineSize:~0,-4!+0, sizeMB=size/105"
  echo       • !dn! - Searching for folder:
  echo                "%%z" (about !SizeMB! MBytes^) %CSI%0K

  set "found="
  for /f "usebackq tokens=1,2,*" %%a in ("%tmp%\CleanWA.RegisteredApps_good.tmp.txt") do if not defined found (
        rem echo       • Pkg. InstallLocation "%%c" %CSI%0K %RI%
        set _V_=%%z
        set v1="x!_V_:%%a_%%b=!x"
        set v2="x!_V_!x"
        if !v1! NEQ !v2! (
        set Found=1
        rem echo(      %CSI%102m? Found an app that use this folder ?%CSI%44m : %CSI%102m"%%a"%CSI%44m version: "%%b" %CSI%0K
    )
  )

  if defined found (
    echo(%RI%%RI%%RI%
  ) else (
    rem check for unknown folder/dir
    rem known:_x64_, _x86_, _neutral_ .... others?
        set v1="x!_V_:_x64_=!x"
        set v2="x!_V_!x"
    set "OK=N"
        if !v1! NEQ !v2! set "OK=Y"
        set v1="x!_V_:_x86_=!x"
    if !v1! NEQ !v2! set "OK=Y"
    set v1="x!_V_:_neutral_=!x"
    if !v1! NEQ !v2! set "OK=Y"

    if !OK! NEQ Y (
        echo(
        echo       %CSI%43m? No Match folder: "%%z" ?%CSI%44m%CSI%0K
        echo(
        echo(
        ping -n 2 127.0.0.1 >nul
    ) else (
        echo(
        echo       %CSI%101m? orphans folder! ?%CSI%44m%CSI%0K
        echo(
        echo(
        rem why 105? 1024*1024/10000 ~ 105
        set /A "Totsize+=size, TotsizeMB=Totsize/105, dirsOrphans+=1"
        title CleanWA %Version% [Tot. space orphans: about !TotsizeMB! MB] [dirs/folder orphans: !dirsOrphans!]
        (
          echo rem folder "%%z" (about !oldlineSize! Bytes^, about !SizeMB! MBytes^)
          echo takeown /F "%WA%\%%z" /r /d y
          echo icacls "%WA%\%%z" /t /grant %USERNAME%:F
          echo RD /Q /S "%WA%\%%z"
          echo(
                ) >> "%tmp%\CleanWA.report_orphans.tmp.txt"
    )
  )
  rem pathping 127.0.0.1 -n -q 1 -p 100 >nul
)
echo(
echo(
echo(
echo     v check finished.
echo(
echo dirs/folder orphans: !dirsOrphans!
echo(
echo Tot. space orphans: about !TotsizeMB! MB
echo(

>nul: copy /a "%tmp%\CleanWA.RegisteredApps_NO_good.tmp.txt" + /a "%tmp%\CleanWA.report_orphans.tmp.txt" "%tmp%\CleanWA.report.tmp.txt"

echo coping Report/script "%tmp%\CleanWA.report.tmp.txt" for offline delete in \Users\Public
copy "%tmp%\CleanWA.report.tmp.txt" \Users\Public
echo(
pause

start notepad "\Users\Public\CleanWA.report.tmp.txt"

:the_end
:: pause if double clicked on instead of run from command line.
echo %cmdcmdline% | findstr /I /L %comspec% >nul 2>&1
if %errorlevel% == 0 echo( & pause
exit /B 0
goto :eof
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::   SUBROUTINE   ::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:Restart_as_Administrator
(Fsutil Dirty Query %SystemDrive%>Nul)&&(
        if "%1" neq "admin" (
        start "" /MAX "%~f0" admin
        Exit
    )
)||(
    mode con cols=90 lines=20
    echo( & echo     It is necessary to start the script with administrative rights.
    echo( & echo     Please wait ... I am restarting the script with administrative rights.
    echo( & echo     Answer "YES" to the next User Account Control UAC request to continue
    echo(     running this script with administrative permissions. & echo(
    timeout /t 4
    powershell.exe -c "Start -WindowStyle Maximized -Verb RunAs cmd /c, ("^""%~f0"^"" -replace '[;,=() &^]', '^$0'), "admin" " & Exit
)
goto :eof
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:video_setting
    For /F %%a In ('echo prompt $E^| cmd') Do Set "ESC=%%a"
    set "CSI=%ESC%[" & set "RI=%ESC%M"
    set "echoVT=<nul set/p.="
        rem get windows size
    for /f "tokens=1,2 skip=3" %%A in ('powershell -command "$Host.ui.rawui.WindowSize"'
    ) do set /a windowWidth=%%A, windowHeight=%%B, sm_e=%%B - 3
    mode con: COLS=%windowWidth% LINES=%windowHeight%
    Rem Setting Scrolling Margins
    echo %CSI%4;%sm_e%r
    rem set UTF-8
    chcp 65001
    cls
goto :eof
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:pad
  set "pad=!%1!"
  for /L %%L in (1,1,%2) do set "pad= !pad!"
  set "pad=!pad:~-%2!"
  set "%1=!pad!"
goto :eof
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
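
A note on the size arithmetic in the script: set /A only handles 32-bit signed integers, so the script cuts the last four digits off the byte count (which divides it by 10000) and then divides by 105, since 1024*1024/10000 is roughly 105. Here is a minimal Python sketch of the same approximation; approx_mbytes is a name chosen for this illustration and is not part of the script.

```python
def approx_mbytes(byte_count: str) -> int:
    """Approximate MBytes the way the batch script does.

    byte_count is the size as a string of digits, as read from the dir output.
    """
    # Drop the last four digits (integer division by 10000); an empty
    # result counts as 0, like "!oldlineSize:~0,-4!+0" in the script.
    truncated = int(byte_count[:-4] or "0")
    # 1024*1024/10000 ~ 105
    return truncated // 105

print(approx_mbytes("123456789"))     # exact value would be about 117.7 MB
print(approx_mbytes("193273528320"))  # 180 * 1024^3 bytes
```

For 180 * 1024^3 bytes this yields 184070, against an exact value of 184320 MB, which is close enough for a report.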



Thursday, April 21, 2022

MS SQL Server can't register SPN when started with a service account

When you change the service account used by the MS SQL Server services, they are often not able to register the corresponding SPN in Active Directory.

You should see messages like this in the SQL Server log when it's working correctly:

The SQL Server Network Interface library successfully registered the Service Principal Name (SPN) [ MSSQLSvc/SQL.testdomain.in:24629 ] for the SQL Server service.

SQL Server error log for default instance

If you get errors like this instead, the SPN registration (and therefore the later lookup via AD) is not working:

The SQL Network Interface library could not register the Service Principal Name (SPN) for the SQL Server service. Error: 0x54b. Failure to register an SPN may cause integrated authentication to fall back to NTLM instead of Kerberos. This is an informational message. Further action is only required if Kerberos authentication is required by authentication policies. 

For a service account to be able to register the SPN, you need to set these rights on the AD user account:

Now you just have to restart the MS SQL service, and it should be able to register the SPN in AD.

If you still receive the error message, it's because the corresponding SPNs are still / already registered on another object (mostly the computer account of the MS SQL server) and the service account has no rights to modify them (since we only granted it rights on its own account).

So the simplest way to do this is to use the setspn command to remove the stale entries.

You can look what SPN entries are registered for a specific AD object with this command:

setspn -l domain\sql-server

You will then probably see something like this:

  • MSSQLSvc/sql-server.domain.local
  • MSSQLSvc/sql-server.domain.local:1433

So you can also remove them from the object with:

setspn -d  MSSQLSvc/sql-server.domain.local domain\sql-server

setspn -d  MSSQLSvc/sql-server.domain.local:1433 domain\sql-server
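
If you have to do this for several servers, the two removal commands can be generated from the host name. The following Python sketch only builds the command strings shown above and does not call setspn; spn_names and removal_commands are hypothetical helper names, and the default port 1433 is an assumption that has to match your instance.

```python
def spn_names(fqdn: str, port: int = 1433) -> list:
    """Return the two SPN forms listed above for one SQL Server host."""
    return [f"MSSQLSvc/{fqdn}", f"MSSQLSvc/{fqdn}:{port}"]

def removal_commands(fqdn: str, account: str, port: int = 1433) -> list:
    """Build the setspn -d lines that remove those SPNs from an AD object."""
    return [f"setspn -d {spn} {account}" for spn in spn_names(fqdn, port)]

# Same placeholder names as in the commands above
for command in removal_commands("sql-server.domain.local", r"domain\sql-server"):
    print(command)
```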

Now you restart the MS SQL service and it should be able to register.

If it still throws errors, the SPN is probably assigned to another account. In that case just try to add the SPN manually, and it will tell you where the duplicate SPN can be found.

setspn -a  MSSQLSvc/sql-server.domain.local domain\service-account

This can also happen when you switch from one service account to another.



Monday, February 28, 2022

Outlook does not allow you to subscribe to all calendars of distribution lists

In previous Outlook versions you could subscribe to the calendars of all members of a distribution group by just selecting the group in the address book.

With current Outlook versions this mostly doesn't work, and you receive a message that you have to subscribe to the calendars one by one.

In German this is the error message you get:

The same behaviour also occurs when you try to subscribe to all rooms at the same time.

To get this working again, you have to disable the "Shared calendar improvements", at least for the duration when subscribing to the calendars.

When this checkbox is disabled, you can again subscribe to multiple calendars in one step.

You can turn on the shared calendar improvements with a checkbox.

https://support.microsoft.com/en-us/office/outlook-calendar-sharing-updates-c3aec5d3-55ce-4cea-84b0-80aab6d8dc26

Please be aware that you should re-enable this setting after subscribing; otherwise your subscribed calendars will get lost when you have no connection to the server or are working offline.

And a small side note: the GPO policy mentioned in the linked article no longer exists, so don't look for it in current ADMX templates.

Also, setting the mentioned policy registry key seems to have no effect with current Outlook versions as of 2022.

There is a new setting for REST communication with the calendar server, but it does not seem to solve the subscription problem.