Not really "now available." Local LLM inference has been possible for quite a while now. My personal preference is KoboldCPP, since it's leaner and simpler than most.
I'd suggest running anything Python-based like that through Miniconda or Anaconda. Even with a venv, these projects can make a real mess of things, and distros are moving away from letting users run pip natively anyway. Also, most of them still pin Python 3.10, which is horribly outdated by now.
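For what it's worth, a minimal sketch of the conda approach (env name `llm` and the `requirements.txt` path are just placeholders; this assumes Miniconda or Anaconda is already installed):

```shell
# Create an isolated env pinned to the Python the project expects,
# so the system interpreter and system pip are never touched.
conda create -n llm python=3.10 -y

# Activate it; pip now installs into the env, not the distro.
conda activate llm
pip install -r requirements.txt

# Tear it down cleanly when done -- no stray site-packages left behind.
conda deactivate
conda env remove -n llm -y
```

The point is that the whole install lives inside one directory conda manages, so a broken dependency tree is fixed by deleting the env rather than untangling the system Python.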