Let’s Write a Lexer in PlayBASIC

October 12, 2025

 


Introduction

Welcome back, PlayBASIC coders!

In this live session, I set out to build something every programming language and tool needs — a lexer (or lexical scanner). If you’ve never written one before, don’t worry — this guide walks through the whole process step by step.

A lexer’s job is simple: it scans through a piece of text and classifies groups of characters into meaningful types — things like words, numbers, and whitespace. These little building blocks are called tokens, and they form the foundation for everything that comes next in a compiler or interpreter.

So, let’s dive in and build one from scratch in PlayBASIC.


Starting with a Simple String

We begin with a test string — just a small bit of text containing words, spaces, and a number:

s$ = "   1212123323      This is a message number"
Print s$

This gives us something to analyze. The plan is to loop through this string character by character, figure out what each character represents, and then group similar characters together.

In PlayBASIC, strings are 1-indexed, which means the first character is at position 1 (not 0 like in some other languages). So our loop will run from 1 to the length of the string.


Stepping Through Characters

The core of our lexer is a simple `For/Next` loop that moves through each character:

For lp = 1 To Len(s$)
    ; Mid() without the $ gives us the character's ASCII code, not a string
    ThisCHR = Mid(s$, lp)
Next

At this stage, we’re just reading characters — no classification yet.

The next question is: how do we know what type of character we’re looking at?


Detecting Alphabetical Characters

We start by figuring out if a character is alphabetical. The simplest way is by comparing ASCII values:

If ThisCHR >= Asc("A") And ThisCHR <= Asc("Z")
    ; Uppercase
EndIf

If ThisCHR >= Asc("a") And ThisCHR <= Asc("z")
    ; Lowercase
EndIf

That works, but it’s messy to write out in full every time. So let’s clean it up by rolling it into a helper function:

Function IsAlphaCHR(ThisCHR)
    State = (ThisCHR >= Asc("a") And ThisCHR <= Asc("z")) Or _
            (ThisCHR >= Asc("A") And ThisCHR <= Asc("Z"))
EndFunction State

Now we can simply check:

If IsAlphaCHR(ThisCHR)
    Print Chr$(ThisCHR)
EndIf

That already gives us all the letters from our string — but one at a time.

To make it more useful, we’ll start grouping consecutive letters into words.


Grouping Characters into Words

Instead of reacting to each character individually, we look ahead to find where a run of letters ends. This is done with a nested loop:

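If IsAlphaCHR(ThisCHR)
    ; Scan ahead from lp until the run of letters ends
    For ChrLP = lp To Len(s$)
        If Not IsAlphaCHR(Mid(s$, ChrLP)) Then Exit
        EndPOS = ChrLP
    Next
    ThisWord$ = Mid$(s$, lp, (EndPOS - lp) + 1)
    Print "Word: " + ThisWord$
    ; Jump the outer loop to the end of the word so Next resumes just after it
    lp = EndPOS
EndIf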

Now our lexer can detect whole words — groups of letters treated as a single unit.

That’s the first real step toward tokenization.


Detecting Whitespace

The next type of token is whitespace — spaces and tabs.

We’ll build another helper function:

Function IsWhiteSpace(ThisCHR)
    State = (ThisCHR = Asc(" ")) Or (ThisCHR = 9)
EndFunction State

Then use the same nested-loop pattern:

If IsWhiteSpace(ThisCHR)
    For ChrLP = lp To Len(s$)
        If Not IsWhiteSpace(Mid(s$, ChrLP)) Then Exit
        EndPOS = ChrLP
    Next
    WhiteSpace$ = Mid$(s$, lp, (EndPOS - lp) + 1)
    Print "White Space: " + Str$(Len(WhiteSpace$))
    lp = EndPOS
EndIf

Now we can clearly see which parts of the string are spaces and how many characters each whitespace block contains.


Detecting Numbers

Finally, let’s detect numeric characters using another helper:

Function IsNumericCHR(ThisCHR)
    State = (ThisCHR >= Asc("0")) And (ThisCHR <= Asc("9"))
EndFunction State

And apply it just like before:

If IsNumericCHR(ThisCHR)
    For ChrLP = lp To Len(s$)
        If Not IsNumericCHR(Mid(s$, ChrLP)) Then Exit
        EndPOS = ChrLP
    Next
    Number$ = Mid$(s$, lp, (EndPOS - lp) + 1)
    Print "Number: " + Number$
    lp = EndPOS
EndIf

Now we can identify three types of tokens:

Words (alphabetical groups)

Whitespace (spaces and tabs)

Numbers (digits)
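
Each detector above is shown on its own, so here is a rough sketch of how the three might sit together in one scanning pass. It is written in Python rather than PlayBASIC, purely as an illustration: the names `is_alpha`, `is_whitespace`, `is_numeric` and `scan` are hypothetical stand-ins for the helpers in this post, and Python strings are 0-indexed where PlayBASIC's are 1-indexed.

```python
# Illustrative Python version of the scanning loop; not the PlayBASIC source.

def is_alpha(ch):
    return "a" <= ch <= "z" or "A" <= ch <= "Z"

def is_whitespace(ch):
    return ch == " " or ch == "\t"

def is_numeric(ch):
    return "0" <= ch <= "9"

CLASSES = (("word", is_alpha), ("whitespace", is_whitespace), ("number", is_numeric))

def scan(text):
    """Return a list of (kind, run) pairs using the look-ahead pattern."""
    runs = []
    pos = 0
    while pos < len(text):
        for kind, test in CLASSES:
            if test(text[pos]):
                # Look ahead to find where this run of similar characters ends
                end = pos
                while end < len(text) and test(text[end]):
                    end += 1
                runs.append((kind, text[pos:end]))
                pos = end  # continue scanning just after the run
                break
        else:
            pos += 1  # character fits none of the three classes; skip it
    return runs

text = "   1212123323      This is a message number"
runs = scan(text)
for kind, run in runs:
    print(kind, repr(run))
```

Because every character in the test string belongs to one of the three classes, joining the runs back together should reproduce the original string, which makes a handy sanity check.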


Defining a Token Structure

Up to this point, our program just prints what it finds.

Let’s store these tokens properly by defining a typed array.

Type tToken
    TokenType
    Value$
    Position
EndType
Dim Tokens(1000) As tToken

We’ll also define some constants for readability:

Constant TokenTYPE_WORD        = 1
Constant TokenTYPE_NUMERIC     = 2
Constant TokenTYPE_WHITESPACE  = 4

As we detect tokens, we add them to the array:

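Tokens(TokenCount).TokenType = TokenTYPE_WORD
Tokens(TokenCount).Value$    = ThisWord$
; lp is still the start of the run here, so record it before lp = EndPOS
Tokens(TokenCount).Position  = lp
TokenCount++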

Do the same for whitespace and numbers, and our lexer now builds a real list of tokens as it runs.
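
To make the token record a little more concrete, here is a small Python sketch of the same idea (again an illustration, not PlayBASIC). The `Token` class and the `add_token` helper are hypothetical names. The constants reuse the values 1, 2 and 4 from above; since those are powers of two they can be combined into a mask, although the post doesn't say whether that was the reason for choosing them.

```python
from dataclasses import dataclass

TOKEN_WORD = 1
TOKEN_NUMERIC = 2
TOKEN_WHITESPACE = 4

@dataclass
class Token:
    token_type: int
    value: str
    position: int  # 1-based start of the run, PlayBASIC style

tokens = []

def add_token(token_type, value, position):
    """Append one token and return its index in the list."""
    tokens.append(Token(token_type, value, position))
    return len(tokens) - 1

# Hand-fed runs from the start of the test string
add_token(TOKEN_WHITESPACE, "   ", 1)
add_token(TOKEN_NUMERIC, "1212123323", 4)
add_token(TOKEN_WHITESPACE, "      ", 14)
add_token(TOKEN_WORD, "This", 20)

# One mask selects several token types at once
mask = TOKEN_WORD | TOKEN_NUMERIC
meaningful = [t.value for t in tokens if t.token_type & mask]
print(meaningful)
```

Filtering by mask like this is one way a later stage could ignore whitespace without checking each type separately.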


Displaying Tokens by Type

To visualize the result, we can print each token in a different colour:

For lp = 0 To TokenCount - 1
    Select Tokens(lp).TokenType
        Case TokenTYPE_WORD:       c = $00FF00 ; green
        Case TokenTYPE_NUMERIC:    c = $0000FF ; blue
        Case TokenTYPE_WHITESPACE: c = $000000 ; black
        Default:                   c = $FF0000 ; red (unknown type)
    EndSelect

    Ink c
    Print Tokens(lp).Value$
Next

When we run this version, we see numbers printed in blue, words in green, and whitespace appearing as black gaps — exactly how a simple syntax highlighter or compiler front-end might visualize tokenized text.


Wrapping Up

And that’s it — our first lexer!

It reads through a line of text, classifies what it finds, and records each token type for later use.

The same process underpins many systems:

Compilers use it as the first step in parsing code.

Adventure games might use it to process typed player commands.

Expression evaluators or script interpreters rely on it to break down formulas and logic.

The big takeaway? A lexer doesn’t have to be complicated.

This simple approach — scanning text, detecting groups, and tagging them — is the heart of it. Once you understand that, you can expand it to handle symbols, punctuation, operators, and beyond.
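
As a taste of that expansion, here is one possible way to add a fourth character class for symbols, sketched in Python rather than PlayBASIC. Everything here is an assumption for illustration: the `SYMBOLS` set, the `classify` helper, and the choice to emit each symbol as its own single-character token instead of grouping them into runs.

```python
SYMBOLS = "+-*/=()<>,"  # hypothetical set; pick whatever your language needs

def classify(ch):
    """Name the character class of a single character."""
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "word"
    if "0" <= ch <= "9":
        return "number"
    if ch in " \t":
        return "whitespace"
    if ch in SYMBOLS:
        return "symbol"
    return "unknown"

def scan(text):
    tokens = []
    pos = 0
    while pos < len(text):
        kind = classify(text[pos])
        end = pos + 1
        # Words, numbers and whitespace extend as runs; symbols stay single
        if kind in ("word", "number", "whitespace"):
            while end < len(text) and classify(text[end]) == kind:
                end += 1
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens

tokens = scan("total = (price + 12) * 3")
visible = [value for kind, value in tokens if kind != "whitespace"]
print(visible)
```

Keeping symbols as single tokens means "(" and ")" never merge with a neighbouring operator, which tends to make a later parsing stage simpler. Multi-character operators such as "<=" would need an extra look-ahead step.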

If you’d like to see more about extending this lexer or turning it into a parser, let me know in the comments — or check out the full live session on YouTube.

Links:

  • PlayBASIC.com
  • Learn to basic game programming (on Amazon)
  • Learn to code for beginners (on Amazon)




Is XOR Decryption in PlayBASIC as Fast as Assembly?

    July 07, 2025

     


    🔍 Is XOR Decryption in PlayBASIC as Fast as Assembly?

    Every now and then, a forum question pops up that really catches my attention — and this one did just that. A PlayBASIC user recently asked:

    > "Is using XOR decryption when loading media from memory in PlayBASIC as fast as doing it in assembly?"

    At first, I was a little puzzled. Why? Because the function in question is written in assembly — it's already doing exactly what the user thought might be a separate optimization path. So, let's unpack what's really going on behind the scenes when you XOR encrypted media in memory using PlayBASIC.


    🔐 XOR Media Loading: A Quick Recap

    Years ago, PlayBASIC added support for loading media directly from memory. Earlier versions relied on external packer tools to encrypt and wrap media, but these days, you can load and decode encrypted content entirely from within your program.

    The basic workflow is:

    1. Load your file into memory.
    2. Call the `XORMemory` function with a key.
    3. The content is decrypted and ready to use.

    You can use any XOR key you like. While XOR encryption is relatively simple and easily reversible, it’s still useful for basic protection against casual asset ripping.
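
The reason one call both encrypts and decrypts is that XOR undoes itself: applying the same key twice gives back the original bytes. Here is a small Python sketch of that property. It is not the `XORMemory` implementation; the `xor_bytes` helper, the example key, and the assumption of a repeating 32-bit key are all illustrative.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data against a repeating 32-bit key."""
    key_bytes = key.to_bytes(4, "little")
    return bytes(b ^ key_bytes[i % 4] for i, b in enumerate(data))

KEY = 0xA5C3F00D  # arbitrary example key
original = b"pretend this is image data"

encrypted = xor_bytes(original, KEY)   # what you would ship on disk
decrypted = xor_bytes(encrypted, KEY)  # the same call again restores it

print(encrypted != original, decrypted == original)
```

This is also why XOR only deters casual ripping: anyone who finds the key, or guesses enough of the original bytes, can reverse it just as easily.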


    🧠 What Happens Internally?

    When you call `XORMemory`, PlayBASIC doesn’t interpret the data — it pushes the work down to the engine’s internal rendering system. Specifically, it uses the XOR ink mode inside the `Box` drawing function.

    This function writes color data onto a surface by XOR’ing it with the existing pixels. Here’s what makes it cool: that surface isn’t necessarily a visible screen — it's just treated as raw memory.

    To decrypt, the engine:

  • Creates a temporary 32-bit image buffer (must be 32-bit to handle raw data correctly).
  • Loads the encrypted file data into that buffer.
  • Applies the XOR key using the `Box` command in XOR mode.
  • Copies the result back to memory.

    That’s it.


    💥 But Is It Fast?

    Yes. Very fast — because under the hood, this process is powered by raw MMX assembly.

    When the engine detects MMX support, it uses MMX instructions to process 64 bits (two 32-bit pixels) at a time:

  • Data is loaded into MMX registers.
  • XOR is performed at the hardware level.
  • Results are written back immediately.

    Here’s the inner loop in plain terms:

  • Load two pixels from memory.
  • Load XOR key into a register.
  • XOR them.
  • Write them back.
  • Repeat in a tight loop.

    We’re talking near cycle-per-pixel speeds here: hardware-level performance. If MMX isn't available, it gracefully falls back to optimized C code. Either way, you're getting a performance-optimized routine.
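
To picture the "two pixels per step" idea, here is a rough Python sketch that XORs a buffer 64 bits at a time and checks the result against a plain byte-by-byte version. Python has no MMX, so this only mirrors the data flow, not the speed. The function names are hypothetical, and the little-endian layout and tail handling are assumptions rather than details of the engine.

```python
def xor_bytewise(data: bytes, key: int) -> bytes:
    """Reference version: one byte at a time against a repeating 32-bit key."""
    key_bytes = key.to_bytes(4, "little")
    return bytes(b ^ key_bytes[i % 4] for i, b in enumerate(data))

def xor_two_pixels_per_step(data: bytes, key: int) -> bytes:
    """XOR in 8-byte steps, i.e. two 32-bit pixels per iteration."""
    key64 = (key << 32) | key  # the same key in both halves of a 64-bit value
    out = bytearray()
    whole = len(data) - len(data) % 8
    for offset in range(0, whole, 8):
        chunk = int.from_bytes(data[offset:offset + 8], "little")  # load two pixels
        out += (chunk ^ key64).to_bytes(8, "little")               # XOR and write back
    # Any leftover bytes (fewer than 8) are handled one at a time
    out += xor_bytewise(data[whole:], key)
    return bytes(out)

KEY = 0xA5C3F00D  # arbitrary example key
data = bytes(range(37))  # deliberately not a multiple of 8

fast = xor_two_pixels_per_step(data, KEY)
print(fast == xor_bytewise(data, KEY))
```

Both versions produce the same bytes; the wide version simply does an eighth as many loop iterations, which is the same trade the MMX path makes in hardware.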


    🕰 Legacy Notes

    Older machines or systems using 16-bit display modes may encounter issues unless you force a 32-bit surface. That’s why the engine explicitly creates a 32-bit buffer in the decoding routine — it ensures consistent behavior across different environments.

    Also worth noting: drawing directly to the screen (especially in older systems where the screen buffer lives in VRAM) would be very slow due to the read/write overhead. But modern systems (e.g., Windows 10/11) emulate these surfaces in system memory, allowing direct blending without penalty.


    ✅ Final Thoughts

    So, to answer the original question:

    Yes — XOR decryption in PlayBASIC is as fast as it can be. It’s literally done in machine code.

    This is just one example of how PlayBASIC leans on low-level optimizations to make higher-level features accessible and fast. You get the convenience of a BASIC command, but the performance of assembly behind the scenes.


    Got more technical questions?

    Join the conversation on the forums, or check out the help files for more info about ink modes, memory banks, and low-level drawing operations.


    Tags:

    `#PlayBASIC` `#GameDev` `#Encryption` `#Assembly` `#MMX` `#XOR` `#RetroCoding` `#Performance`

    BASIC Isn't Dead. It Just Grew Up.

    June 01, 2025

     


    1. BASIC Isn't Dead. It Just Grew Up.

    If you learned to program in the 80s or 90s, chances are your first line of code looked something like this:

    PRINT "Hello, World!"
    

    To many, BASIC feels like a relic of computing’s early days—an outdated teaching tool overshadowed by modern languages like Python or JavaScript. But that assumption couldn't be more wrong. BASIC never disappeared. It diversified. It evolved. And today, it lives on in a wide range of powerful, practical dialects still being used to build games, web tools, business systems, and more.

    2. What Made BASIC Great Then Still Matters Today

    BASIC was designed to be accessible. Its name stands for Beginner's All-purpose Symbolic Instruction Code, and its creators wanted students to focus on learning to solve problems, not memorizing syntax. That simplicity, that clarity, is what makes BASIC surprisingly relevant even now.

    Today's developers value code that is readable, quick to write, and easy to maintain. Sound familiar? That’s the same philosophy behind modern favorites like Python, Lua, and even aspects of Swift. BASIC got there decades earlier.

    3. Visual Basic: The Face Everyone Recognizes

    No conversation about BASIC is complete without mentioning Visual Basic (VB). Introduced by Microsoft in the early 90s, VB turned BASIC into a powerhouse for Windows development. Its visual form designer and event-driven model made it the go-to language for building business applications and internal tools.

    Even today, VB.NET is still supported in Visual Studio, and VBA (Visual Basic for Applications) remains deeply embedded in Microsoft Office, driving automation and macros across the business world.

    But here’s the key point: Visual Basic is just one dialect. BASIC’s legacy didn’t stop with VB—it blossomed into a wide ecosystem of modern tools, many of which continue to be actively developed.

    4. Beyond VB: The Modern BASIC Landscape

    🎮 Game Development

  • PlayBASIC: Designed for 2D game development with beginner-friendly syntax and graphics built-in.
  • BlitzBASIC / Blitz3D / BlitzMax: Known for real-time game development. Still loved by retro game coders.
  • DarkBASIC: Created to simplify 3D game creation on Windows.

    ⚙️ General Purpose / Desktop

  • PureBASIC: Cross-platform, compiled language with full support for GUIs, DLLs, and multimedia.
  • FreeBASIC: A modern take on QBASIC with low-level access, C-like performance, and inline assembly support.
  • Oxygen BASIC: Lightweight and powerful, compiling directly to machine code.

    📱 Mobile and Cross-Platform

  • B4X: Formerly Basic4Android, B4X now targets Android, iOS, and desktop platforms with VB-like syntax.

    🌐 Web, Scripting & Automation

  • VBA: Still widely used in Excel and Access to automate reports, calculations, and workflows.
  • BasicAnywhere: A lightweight BASIC interpreter that runs in the browser.

    5. BASIC vs. Python: Different Names, Shared Philosophy

    Many developers praise Python for its simplicity, readability, and gentle learning curve. But that exact spirit was BASIC's mission from day one. The syntax and philosophy of BASIC have more in common with Python than most realize:

    REM BASIC
    PRINT "Hello, World!"
    
    # Python
    print("Hello, World!")
    

    Both prioritize clarity over cleverness. Both are great for beginners and prototyping. BASIC simply got there first.

    6. Why BASIC Still Deserves a Place at the Table

    Modern BASICs aren't just toys. They support features you'd expect in any serious language:

  • Compiled executables
  • GUI frameworks
  • Graphics and sound
  • Cross-platform support
  • Integration with system APIs

    In many cases, these tools are faster to learn and deploy than bloated stacks involving multiple frameworks and languages. That makes BASIC a compelling choice for hobbyists, indie developers, and even small businesses looking for quick, effective solutions.

    7. Final Thoughts: BASIC Isn’t Just Nostalgia—It’s a Toolset That Works

    The myth that BASIC is obsolete is just that—a myth. While it may not dominate headlines, BASIC continues to evolve, empower, and enable. It never stopped being useful. It never stopped being fun.

    If you're a Python fan, or just want to create something without jumping through endless setup hoops, explore modern BASICs. There’s a whole ecosystem waiting for rediscovery.


    Where to Try Modern BASICs:

  • FreeBASIC
  • PureBASIC
  • PlayBASIC
  • B4X
  • QB64
  • BlitzMax NG
  • Oxygen BASIC

    BASIC didn’t fade away. It just grew up quietly. And it’s still here, faster, friendlier, and more flexible than ever.