Let’s Write a Lexer in PlayBASIC

October 12, 2025

 


Introduction

Welcome back, PlayBASIC coders!

In this live session, I set out to build something every programming language and tool needs — a lexer (or lexical scanner). If you’ve never written one before, don’t worry — this guide walks through the whole process step by step.

A lexer’s job is simple: it scans through a piece of text and classifies groups of characters into meaningful types — things like words, numbers, and whitespace. These little building blocks are called tokens, and they form the foundation for everything that comes next in a compiler or interpreter.

So, let’s dive in and build one from scratch in PlayBASIC.


Starting with a Simple String

We begin with a test string — just a small bit of text containing words, spaces, and a number:

s$ = "   1212123323      This is a message number"
Print s$

This gives us something to analyze. The plan is to loop through this string character by character, figure out what each character represents, and then group similar characters together.

In PlayBASIC, strings are 1-indexed, which means the first character is at position 1 (not 0 like in some other languages). So our loop will run from 1 to the length of the string.


Stepping Through Characters

The core of our lexer is a simple `For/Next` loop that moves through each character:

For lp = 1 To Len(s$)
    ThisCHR = Mid(s$, lp)
Next

At this stage, we’re just reading characters — no classification yet.

The next question is: how do we know what type of character we’re looking at?


Detecting Alphabetical Characters

We start by figuring out if a character is alphabetical. The simplest way is by comparing ASCII values:

If ThisCHR >= Asc("A") And ThisCHR <= Asc("Z")
    ; Uppercase
EndIf

If ThisCHR >= Asc("a") And ThisCHR <= Asc("z")
    ; Lowercase
EndIf

That works, but it’s messy to write out in full every time. So let’s clean it up by rolling it into a helper function:

Function IsAlphaCHR(ThisCHR)
    State = (ThisCHR >= Asc("a") And ThisCHR <= Asc("z")) Or _
            (ThisCHR >= Asc("A") And ThisCHR <= Asc("Z"))
EndFunction State

Now we can simply check:

If IsAlphaCHR(ThisCHR)
    Print Chr$(ThisCHR)
EndIf

That already gives us all the letters from our string — but one at a time.
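If you want to try the same check outside PlayBASIC, here is a rough Python equivalent. It is only a sketch: the name `is_alpha_chr` is my own, and it takes a character code the same way `IsAlphaCHR` does.

```python
def is_alpha_chr(code: int) -> bool:
    """Return True when code is the ASCII value of A-Z or a-z."""
    return ord("a") <= code <= ord("z") or ord("A") <= code <= ord("Z")


s = "   1212123323      This is a message number"

# Keep every alphabetical character, one at a time, like the loop above.
letters = [ch for ch in s if is_alpha_chr(ord(ch))]
print("".join(letters))  # prints Thisisamessagenumber
```

Notice the letters come out as one unbroken stream, because nothing marks where one word stops and the next begins.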

To make it more useful, we’ll start grouping consecutive letters into words.


Grouping Characters into Words

Instead of reacting to each character individually, we look ahead to find where a run of letters ends. This is done with a nested loop:

If IsAlphaCHR(ThisCHR)
    For ChrLP = lp To Len(s$)
        If Not IsAlphaCHR(Mid(s$, ChrLP)) Then Exit
        EndPOS = ChrLP
    Next
    ThisWord$ = Mid$(s$, lp, (EndPOS - lp) + 1)
    Print "Word: " + ThisWord$
    lp = EndPOS
EndIf

Now our lexer can detect whole words — groups of letters treated as a single unit.

That’s the first real step toward tokenization.
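The look-ahead idea can be sketched in Python too. This is an illustration rather than a translation of the PlayBASIC code: `scan_words` and `is_alpha_chr` are names I made up, and Python strings start at index 0 rather than 1.

```python
def is_alpha_chr(code: int) -> bool:
    """Return True when code is the ASCII value of A-Z or a-z."""
    return ord("a") <= code <= ord("z") or ord("A") <= code <= ord("Z")


def scan_words(s: str) -> list[str]:
    """Collect runs of consecutive letters, skipping everything else."""
    words = []
    pos = 0  # 0-based here, unlike PlayBASIC's 1-based strings
    while pos < len(s):
        if is_alpha_chr(ord(s[pos])):
            end = pos
            # Look ahead until the run of letters stops.
            while end + 1 < len(s) and is_alpha_chr(ord(s[end + 1])):
                end += 1
            words.append(s[pos:end + 1])
            pos = end  # jump to the last letter of the run
        pos += 1       # then step past it
    return words


print(scan_words("   1212123323      This is a message number"))
# prints ['This', 'is', 'a', 'message', 'number']
```

The important detail is the jump at the end of the run. Without it, the outer loop would revisit the remaining letters of the word and report them again.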


Detecting Whitespace

The next type of token is whitespace — spaces and tabs.

We’ll build another helper function:

Function IsWhiteSpace(ThisCHR)
    State = (ThisCHR = Asc(" ")) Or (ThisCHR = 9)
EndFunction State

Then use the same nested-loop pattern:

If IsWhiteSpace(ThisCHR)
    For ChrLP = lp To Len(s$)
        If Not IsWhiteSpace(Mid(s$, ChrLP)) Then Exit
        EndPOS = ChrLP
    Next
    WhiteSpace$ = Mid$(s$, lp, (EndPOS - lp) + 1)
    Print "White Space: " + Str$(Len(WhiteSpace$))
    lp = EndPOS
EndIf

Now we can clearly see which parts of the string are spaces and how many characters each whitespace block contains.


Detecting Numbers

Finally, let’s detect numeric characters using another helper:

Function IsNumericCHR(ThisCHR)
    State = (ThisCHR >= Asc("0")) And (ThisCHR <= Asc("9"))
EndFunction State

And apply it just like before:

If IsNumericCHR(ThisCHR)
    For ChrLP = lp To Len(s$)
        If Not IsNumericCHR(Mid(s$, ChrLP)) Then Exit
        EndPOS = ChrLP
    Next
    Number$ = Mid$(s$, lp, (EndPOS - lp) + 1)
    Print "Number: " + Number$
    lp = EndPOS
EndIf

Now we can identify three types of tokens:

  • Words (alphabetical groups)
  • Whitespace (spaces and tabs)
  • Numbers (digits)
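Here is how the three detectors might fit together, sketched in Python. Be aware this is a design variation and not the code from the session: instead of three separate `If` blocks, it uses one hypothetical `classify` function and groups neighbours that share the same class. Anything that is not a letter, digit, space, or tab ends up as "unknown".

```python
def classify(ch: str) -> str:
    """Name the character class, using the same ASCII ranges as the helpers above."""
    code = ord(ch)
    if ord("a") <= code <= ord("z") or ord("A") <= code <= ord("Z"):
        return "word"
    if ord("0") <= code <= ord("9"):
        return "number"
    if ch == " " or code == 9:  # 9 is the tab character
        return "whitespace"
    return "unknown"


def scan(s: str) -> list[tuple[str, str]]:
    """Split s into (kind, text) pairs, one per run of same-class characters."""
    tokens = []
    pos = 0
    while pos < len(s):
        kind = classify(s[pos])
        end = pos + 1
        while end < len(s) and classify(s[end]) == kind:
            end += 1
        tokens.append((kind, s[pos:end]))
        pos = end
    return tokens


s = "   1212123323      This is a message number"
for kind, text in scan(s):
    print(kind, repr(text))
```

Because every character lands in exactly one token, joining the token texts back together reproduces the original string, which makes a handy sanity check.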


Defining a Token Structure

Up to this point, our program just prints what it finds.

Let’s store these tokens properly by defining a typed array.

Type tToken
    TokenType
    Value$
    Position
EndType
Dim Tokens(1000) As tToken

We’ll also define some constants for readability:

Constant TokenTYPE_WORD        = 1
Constant TokenTYPE_NUMERIC     = 2
Constant TokenTYPE_WHITESPACE  = 4

As we detect tokens, we add them to the array:

Tokens(TokenCount).TokenType = TokenTYPE_WORD
Tokens(TokenCount).Value$    = ThisWord$
Tokens(TokenCount).Position  = lp ; start of the token (store it before lp = EndPOS)
TokenCount++

Do the same for whitespace and numbers, and our lexer now builds a real list of tokens as it runs.
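For comparison, a token record like `tToken` could look something like this in Python. Treat it as a sketch: the `Token` class, the `add_token` helper, and the choice to store a 1-based start position are my assumptions, picked to line up with the PlayBASIC fields.

```python
from dataclasses import dataclass

TOKEN_TYPE_WORD = 1
TOKEN_TYPE_NUMERIC = 2
TOKEN_TYPE_WHITESPACE = 4


@dataclass
class Token:
    token_type: int
    value: str
    position: int  # 1-based start position, to match PlayBASIC string indexing


tokens: list[Token] = []


def add_token(token_type: int, value: str, position: int) -> None:
    """Append one token to the list, like filling Tokens(TokenCount)."""
    tokens.append(Token(token_type, value, position))


# Tokens for the text "12 abc", added by hand to show the layout.
add_token(TOKEN_TYPE_NUMERIC, "12", 1)
add_token(TOKEN_TYPE_WHITESPACE, " ", 3)
add_token(TOKEN_TYPE_WORD, "abc", 4)

for token in tokens:
    print(token)
```

A Python list grows as needed, so there is no fixed limit like the 1000 entries in the `Dim` above.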


Displaying Tokens by Type

To visualize the result, we can print each token in a different colour:

For lp = 0 To TokenCount - 1
    Select Tokens(lp).TokenType
        Case TokenTYPE_WORD:       c = $00FF00 ; green
        Case TokenTYPE_NUMERIC:    c = $0000FF ; blue
        Case TokenTYPE_WHITESPACE: c = $000000 ; black
        Default:                   c = $FF0000
    EndSelect

    Ink c
    Print Tokens(lp).Value$
Next

When we run this version, we see numbers printed in blue, words in green, and whitespace appearing as black gaps — exactly how a simple syntax highlighter or compiler front-end might visualize tokenized text.
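The same colour-by-type idea works in a terminal. This Python sketch assumes your terminal understands ANSI escape codes, and the `colourise` helper is a name I invented for the example.

```python
TOKEN_TYPE_WORD = 1
TOKEN_TYPE_NUMERIC = 2
TOKEN_TYPE_WHITESPACE = 4

ANSI_COLOURS = {
    TOKEN_TYPE_WORD: "\033[32m",        # green
    TOKEN_TYPE_NUMERIC: "\033[34m",     # blue
    TOKEN_TYPE_WHITESPACE: "\033[30m",  # black
}
ANSI_DEFAULT = "\033[31m"  # red, for any token type we don't recognise
ANSI_RESET = "\033[0m"


def colourise(token_type: int, value: str) -> str:
    """Wrap value in the colour code for its token type."""
    return ANSI_COLOURS.get(token_type, ANSI_DEFAULT) + value + ANSI_RESET


tokens = [
    (TOKEN_TYPE_NUMERIC, "12"),
    (TOKEN_TYPE_WHITESPACE, " "),
    (TOKEN_TYPE_WORD, "abc"),
]
for token_type, value in tokens:
    print(colourise(token_type, value))
```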


Wrapping Up

And that’s it — our first lexer!

It reads through a line of text, classifies what it finds, and records each token type for later use.

The same process underpins many systems:

  • Compilers use it as the first step in parsing code.
  • Adventure games might use it to process typed player commands.
  • Expression evaluators or script interpreters rely on it to break down formulas and logic.

The big takeaway? A lexer doesn’t have to be complicated.

This simple approach — scanning text, detecting groups, and tagging them — is the heart of it. Once you understand that, you can expand it to handle symbols, punctuation, operators, and beyond.
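As one possible way to extend it, here is a Python sketch that adds a fourth class for symbols. None of this was covered in the session: the `SYMBOLS` set is an arbitrary pick, and keeping each symbol as its own one-character token is a design choice I'm assuming so that something like "+(" comes out as two tokens.

```python
SYMBOLS = set("+-*/=<>(),")  # hypothetical set; adjust to taste


def classify(ch: str) -> str:
    """Name the character class of a single character."""
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "word"
    if "0" <= ch <= "9":
        return "number"
    if ch == " " or ch == "\t":
        return "whitespace"
    if ch in SYMBOLS:
        return "symbol"
    return "unknown"


def scan(text: str) -> list[tuple[str, str]]:
    """Split text into (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        kind = classify(text[pos])
        end = pos + 1
        # Letters, digits and spaces are grouped into runs;
        # symbols stay one character each.
        if kind != "symbol":
            while end < len(text) and classify(text[end]) == kind:
                end += 1
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens


for kind, value in scan("total = 10+(x*2)"):
    print(kind, repr(value))
```

Multi-character operators such as "<=" would need an extra look-ahead step, which this sketch leaves out.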

If you’d like to see more about extending this lexer or turning it into a parser, let me know in the comments — or check out the full live session on YouTube.

Links:

  • PlayBASIC.com
  • Learn to basic game programming (on Amazon)
  • Learn to code for beginners (on Amazon)




    BASIC Isn't Dead. It Just Grew Up.

    June 01, 2025

     


    1. BASIC Isn't Dead. It Just Grew Up.

    If you learned to program in the 80s or 90s, chances are your first line of code looked something like this:

    PRINT "Hello, World!"
    

    To many, BASIC feels like a relic of computing’s early days—an outdated teaching tool overshadowed by modern languages like Python or JavaScript. But that assumption couldn't be more wrong. BASIC never disappeared. It diversified. It evolved. And today, it lives on in a wide range of powerful, practical dialects still being used to build games, web tools, business systems, and more.

    2. What Made BASIC Great Then Still Matters Today

    BASIC was designed to be accessible. Its name stands for Beginner's All-purpose Symbolic Instruction Code, and its creators wanted students to focus on learning to solve problems, not memorizing syntax. That simplicity, that clarity, is what makes BASIC surprisingly relevant even now.

    Today's developers value code that is readable, quick to write, and easy to maintain. Sound familiar? That’s the same philosophy behind modern favorites like Python, Lua, and even aspects of Swift. BASIC got there decades earlier.

    3. Visual Basic: The Face Everyone Recognizes

    No conversation about BASIC is complete without mentioning Visual Basic (VB). Introduced by Microsoft in the early 90s, VB turned BASIC into a powerhouse for Windows development. Its visual form designer and event-driven model made it the go-to language for building business applications and internal tools.

    Even today, VB.NET is still supported in Visual Studio, and VBA (Visual Basic for Applications) remains deeply embedded in Microsoft Office, driving automation and macros across the business world.

    But here’s the key point: Visual Basic is just one dialect. BASIC’s legacy didn’t stop with VB—it blossomed into a wide ecosystem of modern tools, many of which continue to be actively developed.

    4. Beyond VB: The Modern BASIC Landscape

    🎮 Game Development

  • PlayBASIC: Designed for 2D game development with beginner-friendly syntax and graphics built-in.
  • BlitzBASIC / Blitz3D / BlitzMax: Known for real-time game development. Still loved by retro game coders.
  • DarkBASIC: Created to simplify 3D game creation on Windows.

    ⚙️ General Purpose / Desktop

  • PureBASIC: Cross-platform, compiled language with full support for GUIs, DLLs, and multimedia.
  • FreeBASIC: A modern take on QBASIC with low-level access, C-like performance, and inline assembly support.
  • Oxygen BASIC: Lightweight and powerful, compiling directly to machine code.

    📱 Mobile and Cross-Platform

  • B4X: Formerly Basic4Android, B4X now targets Android, iOS, and desktop platforms with VB-like syntax.

    🌐 Web, Scripting & Automation

  • VBA: Still widely used in Excel and Access to automate reports, calculations, and workflows.
  • BasicAnywhere: A lightweight BASIC interpreter that runs in the browser.

    5. BASIC vs. Python: Different Names, Shared Philosophy

    Many developers praise Python for its simplicity, readability, and gentle learning curve. But that exact spirit was BASIC's mission from day one. The syntax and philosophy of BASIC have more in common with Python than most realize:

    REM BASIC
    PRINT "Hello, World!"
    
    # Python
    print("Hello, World!")
    

    Both prioritize clarity over cleverness. Both are great for beginners and prototyping. BASIC simply got there first.

    6. Why BASIC Still Deserves a Place at the Table

    Modern BASICs aren't just toys. They support features you'd expect in any serious language:

  • Compiled executables
  • GUI frameworks
  • Graphics and sound
  • Cross-platform support
  • Integration with system APIs

    In many cases, these tools are faster to learn and deploy than bloated stacks involving multiple frameworks and languages. That makes BASIC a compelling choice for hobbyists, indie developers, and even small businesses looking for quick, effective solutions.

    7. Final Thoughts: BASIC Isn’t Just Nostalgia—It’s a Toolset That Works

    The myth that BASIC is obsolete is just that—a myth. While it may not dominate headlines, BASIC continues to evolve, empower, and enable. It never stopped being useful. It never stopped being fun.

    If you're a Python fan, or just want to create something without jumping through endless setup hoops, explore modern BASICs. There’s a whole ecosystem waiting for rediscovery.


    Where to Try Modern BASICs:

  • FreeBASIC
  • PureBASIC
  • PlayBASIC
  • B4X
  • QB64
  • BlitzMax NG
  • Oxygen BASIC

    BASIC didn’t fade away. It just grew up quietly. And it’s still here: faster, friendlier, and more flexible than ever.



    Improving the Performance of the Classic Bubble Sort Algorithm

    May 16, 2022

     

    Sorting algorithms are a crucial part of programming, and choosing the right one for your data is essential for optimal performance. However, even simple algorithms like Bubble Sort can be improved to handle larger datasets more efficiently. In this post, we’ll explore a few ways to optimize the classic Bubble Sort algorithm, using a PlayBASIC example to demonstrate the improvements.

    Understanding Bubble Sort

    Bubble Sort is one of the most commonly taught sorting algorithms in programming. It’s simple to understand but can be slow for large datasets. The concept is straightforward: you iterate through the data, comparing adjacent elements, and swap them if they are in the wrong order. The process repeats until no swaps are necessary, meaning the array is sorted.

    The key flaw of Bubble Sort is that it’s an O(n²) ("n-squared") algorithm: doubling the number of elements roughly quadruples the number of comparisons, so performance degrades rapidly as the array grows. Despite this, there are still a few optimizations we can apply to make it faster in certain situations.
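To put a number on the "n-squared" growth, this small Python sketch counts comparisons on a worst-case (reversed) list. It is written in Python only so it runs anywhere; the function name is my own, and it mirrors the repeat-until-no-swaps structure of the classic version.

```python
def classic_bubble_sort(data: list[int]) -> int:
    """Sort data in place and return the number of comparisons made."""
    comparisons = 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(data) - 1):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
    return comparisons


for n in (100, 200, 400):
    worst_case = list(range(n, 0, -1))  # n, n-1, ..., 1
    print(n, classic_bubble_sort(worst_case))
# prints:
# 100 9900
# 200 39800
# 400 159600
```

On a reversed list the sort needs n passes of n - 1 comparisons each, so every doubling of n multiplies the work by about four.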

    Optimizing Bubble Sort

    While Bubble Sort will never be the fastest sorting algorithm, there are ways to make it more efficient for specific datasets. Below are a couple of key improvements that can help speed up the process.

    1. Reduce the Set Size After Each Pass

    One improvement involves reducing the size of the array that’s being processed after each pass. As each pass moves the largest remaining element to the end of the array, you don’t need to check it again in subsequent passes. By decreasing the range of elements to check after each pass, you can reduce unnecessary comparisons and speed up the sorting process.
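A minimal Python sketch of the shrinking range, again counting comparisons so the saving is visible. The function name and the `last` variable are my own choices for the illustration.

```python
def bubble_sort_shrinking(data: list[int]) -> int:
    """Sort data in place, shrinking the range each pass; return comparisons made."""
    comparisons = 0
    last = len(data) - 1  # everything after this index is already in place
    swapped = True
    while swapped:
        swapped = False
        for i in range(last):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        last -= 1  # the largest remaining value has reached its final slot
    return comparisons


worst_case = list(range(100, 0, -1))
print(bubble_sort_shrinking(worst_case))  # prints 4950
```

On the same reversed 100-element list, a version that always scans the full array makes 9900 comparisons, so the shrinking range halves the work here.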

    2. Bi-directional Bubble Sort

    Instead of only iterating left to right, the Bi-directional Bubble Sort (also known as Cocktail Shaker Sort) goes through the array in both directions. The first pass moves the largest element to the end of the array (just like the classic version), but the next pass moves the smallest element to the beginning of the array. By alternating directions, this approach can reduce the number of passes needed to sort the data.
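The alternating passes can be sketched in Python like this. It is my own illustration of the general technique rather than a line-by-line port, and the names are assumptions.

```python
def cocktail_shaker_sort(data: list[int]) -> int:
    """Sort data in place with alternating passes; return the number of passes."""
    first = 0
    last = len(data) - 1
    passes = 0
    swapped = True
    while swapped:
        # Forward pass: push the largest remaining value up to index 'last'.
        swapped = False
        passes += 1
        for i in range(first, last):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        last -= 1
        if not swapped:
            break
        # Backward pass: pull the smallest remaining value down to index 'first'.
        swapped = False
        passes += 1
        for i in range(last, first, -1):
            if data[i - 1] > data[i]:
                data[i - 1], data[i] = data[i], data[i - 1]
                swapped = True
        first += 1
    return passes


table = [2, 3, 4, 5, 1]
print(cocktail_shaker_sort(table), table)  # prints 3 [1, 2, 3, 4, 5]
```

The sample list has a small value stuck at the far end. A left-to-right-only sort moves it just one slot per pass, while the backward pass here carries it home in one go.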

    Example Code in PlayBASIC

    Here’s an example implementation of these optimizations in PlayBASIC, which demonstrates the classic Bubble Sort alongside the faster variants:


    loadfont "Courier New", 1, 24
    
    MaxItems = 500
    DIM Table(MaxItems)
    DIM Stats#(10, 5)
    
    DO
        Cls
    
        inc frames
        Seed = Timer()
    
        Test = 1
    
        SeedTable(Seed, MaxItems)
        StartInterval(0)
        ClassicBubbleSort(MaxItems)
        tt1 += EndInterval(0)
        test = Results("Classic Bubble Sort:", Test, MaxItems, Tt1, Frames)
    
    
    
        SeedTable(Seed, MaxItems)
        StartInterval(0)
        ClassicBubbleSortFaster(MaxItems)
        tt2 += EndInterval(0)
        test = Results("Classic Bubble Sort Faster:", Test, MaxItems, TT2, Frames)
    
    
        SeedTable(Seed, MaxItems)
        StartInterval(0)
        BiDirectionalBubbleSort(MaxItems)
        tt3 += EndInterval(0)
        test = Results("BiDirectional Bubble Sort:", Test, MaxItems, Tt3, Frames)
    
    
        Sync
    
        REPEAT
        UNTIL enterkey() = 0
    
    LOOP
    
    
    FUNCTION ShowTable(items)
    
        t$ = ""
        n = 0
        FOR lp = 0 to items
            T$ = t$ + str$(table(lp)) + ", "
            inc n
            IF n > 10
                t$ = Left$(t$, Len(t$) - 1)
                print t$
                t$ = ""
                n = 0
            ENDIF
    
        NEXT lp
    
        IF t$ <> "" THEN print Left$(t$, Len(t$) - 1)
    
    ENDFUNCTION
    
    
    FUNCTION SeedTable(Seed, Items)
        Randomize seed
        FOR lp = 0 to Items
            Table(lp) = Rnd(32000)
        NEXT lp
    ENDFUNCTION
    
    
    FUNCTION ValidateTable(Items)
        result = 0
        FOR lp = 0 to items - 1
            IF Table(lp) > Table(lp + 1)
                result = 1
                exit
            ENDIF
        NEXT lp
    ENDFUNCTION Result
    
    
    
    FUNCTION Results(Name$, index, Items, Time, Frames)
        ` Total Time (the caller already accumulates Time across frames,
        ` so store the running total rather than adding it again)
        Time = Time / 1000
        Stats#(index, 1) = time
    
        print "Sort Type:" + name$
        print "Total Time:" + str$(Stats#(index, 1))
        print "Average Time:" + str$(Stats#(index, 1) / frames)
    
        IF ValidateTable(Items) = 0
            Print "Array Sorted"
        ELSE
            print "NOT SORTED - ERROR"
        ENDIF
        print ""
    
        inc index
    
    ENDFUNCTION index
    
    
    
    
    FUNCTION ClassicBubbleSort(Items)
        Flag = 0
        REPEAT
            Done = 0
            FOR lp = 0 to items - 1
                IF Table(lp) > Table(lp + 1)
                    done = 1
                    t = Table(lp)
                    Table(lp) = Table(lp + 1)
                    Table(lp + 1) = t
                ENDIF
            NEXT lp
        UNTIL done = 0
    ENDFUNCTION
    
    
    
    FUNCTION ClassicBubbleSortFaster(Items)
        Flag = 0
        REPEAT
            Done = 0
            dec items
            FOR lp = 0 to items
                IF Table(lp) > Table(lp + 1)
                    done = 1
                    t = Table(lp)
                    Table(lp) = Table(lp + 1)
                    Table(lp + 1) = t
                ENDIF
            NEXT lp
        UNTIL done = 0
    ENDFUNCTION
    
    
    FUNCTION BiDirectionalBubbleSort(Items)
        First = 0
        Last = Items
    
        REPEAT
            Done = 0
            dec Last
            FOR lp = First to Last
                V = Table(lp + 1)
                IF Table(lp) > V
                    done = 1
                    Table(lp + 1) = Table(lp)
                    Table(lp) = v
                ENDIF
            NEXT lp
    
            IF Done = 1
                Done = 0
                inc First
                FOR lp = Last to First step -1
                    V = Table(lp - 1)
                    IF V > Table(lp)
                        Done = 1
                        Table(lp - 1) = Table(lp)
                        Table(lp) = v
                    ENDIF
                NEXT lp
            ENDIF
        UNTIL Done = 0
    ENDFUNCTION

    Explanation of the Code

  • Table Initialization: We start by defining an array (`Table`) and filling it with random numbers using the `SeedTable` function.
  • Sorting Functions: Three sorting functions are defined:

    - `ClassicBubbleSort`: The traditional Bubble Sort, which compares adjacent elements and swaps them when they are out of order.

    - `ClassicBubbleSortFaster`: This is an optimized version of the classic algorithm where we reduce the set size after each pass.

    - `BiDirectionalBubbleSort`: This method sorts the array by alternating the direction of passes, improving performance.

  • Performance Tracking: The sorting times are tracked using `StartInterval` and `EndInterval`, allowing us to compare the performance of each sorting method.

    Results and Performance

    After running the sorting methods, we display the results, including the total time taken and the average time per frame. We also validate that the array is correctly sorted at the end of each method.

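    The results vary with the size and ordering of the dataset. Shrinking the range after each pass removes roughly half of the comparisons the classic version makes on random data, and the bi-directional version can also finish in fewer passes when small values start near the end of the array. All three variants are still O(n²), so the gap to faster algorithms remains.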
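If you'd like to run a similar comparison outside PlayBASIC, here is a cut-down timing harness sketched in Python. It is not the harness from the listing: `seed_table`, `is_sorted` and `time_sort` are names I made up, `time.perf_counter` stands in for `StartInterval` and `EndInterval`, and only two of the three sorts are included to keep it short.

```python
import random
import time


def seed_table(seed: int, items: int) -> list[int]:
    """Build the same pseudo-random table for a given seed (items + 1 values)."""
    rng = random.Random(seed)
    return [rng.randint(0, 32000) for _ in range(items + 1)]


def is_sorted(table: list[int]) -> bool:
    """Return True when no element is larger than its right-hand neighbour."""
    return all(table[i] <= table[i + 1] for i in range(len(table) - 1))


def classic_bubble_sort(table: list[int]) -> None:
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(table) - 1):
            if table[i] > table[i + 1]:
                table[i], table[i + 1] = table[i + 1], table[i]
                swapped = True


def bubble_sort_shrinking(table: list[int]) -> None:
    last = len(table) - 1
    swapped = True
    while swapped:
        swapped = False
        for i in range(last):
            if table[i] > table[i + 1]:
                table[i], table[i + 1] = table[i + 1], table[i]
                swapped = True
        last -= 1


def time_sort(sort_fn, seed: int, items: int) -> tuple[float, bool]:
    """Sort a freshly seeded table; return (elapsed milliseconds, sorted ok)."""
    table = seed_table(seed, items)
    start = time.perf_counter()
    sort_fn(table)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, is_sorted(table)


seed = 12345
for name, sort_fn in (
    ("Classic Bubble Sort", classic_bubble_sort),
    ("Classic Bubble Sort Faster", bubble_sort_shrinking),
):
    elapsed_ms, ok = time_sort(sort_fn, seed, 500)
    status = "Array Sorted" if ok else "NOT SORTED - ERROR"
    print(f"{name}: {elapsed_ms:.2f} ms, {status}")
```

Re-seeding before each run matters: every sort gets an identical unsorted table, so the timings compare the algorithms and not the luck of the data. The actual numbers depend on your machine.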

    Final Thoughts

    While Bubble Sort is not the most efficient sorting algorithm, these optimizations provide a good demonstration of how you can improve its performance in certain scenarios. Reducing the size of the set and implementing bi-directional sorting can make the classic Bubble Sort more practical for moderate-sized datasets.

    However, if you’re dealing with larger datasets, it’s usually better to use a more advanced algorithm such as Merge Sort or Quick Sort, which run in O(n log n) time on average and scale far better than any Bubble Sort variant.

    As always, the key takeaway is that sorting is situational, and selecting the right algorithm for your data is essential. These optimizations are not a silver bullet but can provide useful improvements in the right circumstances.

    Have Fun with Sorting!

    Sorting is a fundamental concept in computer science, and experimenting with different algorithms and optimizations can help you understand how they work. Feel free to try out these optimizations in your own projects and see how they perform with your data!
