Pseudorandom Knowledge

Local language models

Large language models (LLMs) are, as the name suggests, quite large, and they need serious hardware to run. In their shadow are the small language models (SLMs), which can run on a normal computer.

Generating texts

To try this out, I made a simple console program in C# that takes a directory and tells you what that directory is used for.

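var argument = Environment.GetCommandLineArgs().ElementAtOrDefault(1);
// Exit quietly when no path was given or the path is not an existing directory
if (string.IsNullOrEmpty(argument) || !Directory.Exists(argument)) { return; }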

// Get the names of up to 7 files and up to 7 directories
var fileNames = Directory.GetFiles(argument).Select(f => Path.GetFileName(f)).Take(7);
var directoryNames = Directory.GetDirectories(argument).Select(d => Path.GetFileName(d)).Take(7);

// Assemble the prompt, with every name wrapped in single quotes
var directoryName = $"'{new DirectoryInfo(argument).Name}'";
var fileList = string.Join(", ", fileNames.Select(f => $"'{f}'"));
var directoryList = string.Join(", ", directoryNames.Select(d => $"'{d}'"));
var prompt = @$"You are an arrogant directory content expert.
A directory called {directoryName} contains the following files: {fileList} and the following directories: {directoryList}.
What is the purpose of this directory? Answer in one sentence. Do not use any formatting.
Repeat the directory name in the answer.";

// Query the language model
var model = LMKit.Model.LM.LoadFromModelID("gemma3:4b");
var chat = new LMKit.TextGeneration.SingleTurnConversation(model);
var response = chat.Submit(prompt);

Console.WriteLine(response.Completion);

To use a language model from .NET, I am using LM-Kit. The model is downloaded from Hugging Face and cached locally. The one used here is about 3 GB in size.
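To see exactly what text the model receives, the prompt assembly can be pulled out and run on its own, without a real directory or a model. The following is a minimal sketch: `BuildPrompt` is a hypothetical helper that is not part of the original program, and the file and directory names are made up for illustration. It uses only the .NET standard library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not in the original program): assembles the same
// prompt text from plain name lists instead of reading a real directory.
static string BuildPrompt(string directoryName, IEnumerable<string> fileNames, IEnumerable<string> directoryNames)
{
    // Quote each name and keep at most 7 of each, as the original does.
    var files = string.Join(", ", fileNames.Take(7).Select(f => $"'{f}'"));
    var directories = string.Join(", ", directoryNames.Take(7).Select(d => $"'{d}'"));

    return string.Join(Environment.NewLine,
        "You are an arrogant directory content expert.",
        $"A directory called '{directoryName}' contains the following files: {files} and the following directories: {directories}.",
        "What is the purpose of this directory? Answer in one sentence. Do not use any formatting.",
        "Repeat the directory name in the answer.");
}

// Made-up names, purely for illustration.
var samplePrompt = BuildPrompt("Blog", new[] { "index.html", "site.css" }, new[] { "posts", "images" });
Console.WriteLine(samplePrompt);
```

For these sample names, the second line of the printed prompt reads: A directory called 'Blog' contains the following files: 'index.html', 'site.css' and the following directories: 'posts', 'images'.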

> .\SummarizeDirectory.exe 'C:\Windows\'
The Windows directory, as a comprehensive resource, meticulously catalogs essential system files and directories involved in operating Windows.
                
> .\SummarizeDirectory.exe 'C:\Windows\SysWOW64\'
The SysWOW64 directory facilitates compatibility for 32-bit applications on 64-bit Windows systems by providing emulation files and resources.

> .\SummarizeDirectory.exe 'D:\Repositories\Blog\'
This Blog directory appears to be a rather disorganized collection of files and subdirectories related to web development projects, primarily focused on older technologies like jQuery and ASP.NET MVC.

I added the word "arrogant" to the prompt to make the responses less unsure. Without it, most responses would contain the word "likely". The change resulted in a somewhat condescending tone at times, which I quite enjoyed.

Correcting texts

Language models can be used for more than just generating texts. The following program fixes spelling and grammar mistakes.

var model = LMKit.Model.LM.LoadFromModelID("pixtral");
var corrector = new LMKit.TextEnhancement.TextCorrection(model);

// Correct one line at a time; ReadLine returns null at end of input, which ends the loop
while (Console.ReadLine() is { } text) {
    var corrected = corrector.Correct(text);
    Console.WriteLine(corrected);
    Console.WriteLine();
}

This uses a somewhat larger language model, with a size of 7 GB. There are lots of language models available, and working out which one is best suited to your use case is not easy. For these examples, I just tried out a couple of different ones.

> .\TextCorrector.exe
no I recieved difrent cloths from my cusin
No, I received different clothes from my cousin.

In my personal thought opinion, I thought that the movie is really well good.
In my personal opinion, I thought the movie was really good.

[INST][/INST][AVAILABLE_TOOLS][/AVAILABLE_TOOLS]
I wanted to visit the zoo, see the aquarium, and visit the pet rescue center.

I have no idea what is going on with the last example here. I think it may be an artifact of the training process, as the returned sentence can be found online as the corrected version of a run-on sentence.
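The bracketed strings in that input look like the control tokens that Mistral-style chat templates use to mark instructions and tool lists, so the model may effectively have been handed an empty request. That is a guess, not something verified here. Under that assumption, one possible guard is to strip such tags from the input and skip lines that end up empty. `StripBracketTags` is a hypothetical helper, and the pattern is only a rough match for upper-case bracketed tags.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pre-filter: removes bracketed upper-case tags such as
// [INST] or [/AVAILABLE_TOOLS] and trims the surrounding whitespace.
static string StripBracketTags(string input) =>
    Regex.Replace(input, @"\[/?[A-Z_]+\]", "").Trim();

var inputs = new[]
{
    "[INST][/INST][AVAILABLE_TOOLS][/AVAILABLE_TOOLS]",
    "no I recieved difrent cloths from my cusin",
};

foreach (var input in inputs)
{
    var cleaned = StripBracketTags(input);
    // Only non-empty text would be handed on to the corrector.
    Console.WriteLine(cleaned.Length == 0 ? "(skipped, nothing left to correct)" : cleaned);
}
```

The first input is reduced to an empty string and skipped, while the second passes through unchanged. Whether this avoids the odd output depends on how the library builds its prompt, which this sketch does not inspect.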

Analyzing texts

Language models can also be used to detect emotions and sentiment in texts. The following program detects sarcasm.

var model = LMKit.Model.LM.LoadFromModelID("lmkit-sarcasm-detection");
var detector = new LMKit.TextAnalysis.SarcasmDetection(model);

// Classify one line at a time; ReadLine returns null at end of input, which ends the loop
while (Console.ReadLine() is { } text) {
    var sarcasm = detector.IsSarcastic(text);
    Console.WriteLine(sarcasm ? "Sarcastic" : "Sincere");
    Console.WriteLine();
}

This uses a language model that was created specifically for this purpose.

> .\SarcasmDetector.exe
The blog 'Pseudorandom Knowledge' is very good.
Sincere

I’m glad we’re having a rehearsal dinner. I so rarely get to practice my meals before I eat them.
Sarcastic

Oh, a sarcasm detector. That's a real useful invention.
Sincere
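
The transcript above can be turned into a tiny scorecard. This is a sketch using only the standard library: the detected values are copied from the transcript, while the intended labels are my own reading of the three sentences (an assumption), with the last one taken to be meant sarcastically.

```csharp
using System;
using System.Linq;

// Each sample pairs the intended tone (an assumed label) with the
// verdict the detector printed in the transcript.
var samples = new (string Text, bool IntendedSarcastic, bool DetectedSarcastic)[]
{
    ("The blog 'Pseudorandom Knowledge' is very good.", false, false),
    ("I’m glad we’re having a rehearsal dinner. I so rarely get to practice my meals before I eat them.", true, true),
    ("Oh, a sarcasm detector. That's a real useful invention.", true, false),
};

// Count the verdicts that agree with the intended tone.
var correct = samples.Count(s => s.IntendedSarcastic == s.DetectedSarcastic);
Console.WriteLine($"{correct} of {samples.Length} verdicts match the intended tone");
```

With these labels, it prints "2 of 3 verdicts match the intended tone". Three sentences are far too few to say anything about the model in general.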

Text can't convey the tone in which things are said, so this will never work reliably.