it started with a simple thought: in latin america, many people don’t have resources. many don’t speak english, and often don’t have the chance to learn it. programming can be a way forward, but english raises the barrier. what if there was a language that spoke their language? that’s how hispano-lang began — an open-source experiment, 100% free, created as a bridge for beginners.
i asked first: do spanish languages already exist? yes — several:
- latino: fully in spanish, meant for beginners without the language barrier
- gargar: used in argentinian high schools, pseudocode style, spanish keywords (
si,sino,repetir,mostrar) - español leng: same spirit, with commands like
escribir,leer - linces (lenguaje de computación en español): for spanish-speaking students, basic structures in spanish
even with prior art, it could still be fun to build one — open source, 100% free.
what i’d build with, and whether i know enough
to build “a language”, i’d actually build a compiler or an interpreter (translate to another language or execute directly). with ai around, the learning curve is softer — i won’t reinvent everything.
possible host languages (the implementation language):
- c / c++ (classics; gcc/clang land)
- java (lots of educational compilers)
- python (great for quick prototypes and experimental langs)
- rust / go (modern, memory/concurrency benefits)
lexing/parsing tools:
- lex / yacc (traditional, c)
- flex / bison (modernized)
- antlr (popular; targets java, c#, python, javascript)
- peg.js (parser in javascript)
llvm if i wanted native and fast: emit optimized ir, then real assembly for multiple architectures.
virtual machines: i could write a small vm like python’s bytecode or target an existing vm (jvm, .net clr).
my strongest immediate candidates:
- python: i’m not aiming for a production-grade language right now; this is to help people learn. i already know python.
- rust: modern and fast; lots of companies moving there. caveat: i’ve never written a line of rust.
do i know enough? yes — especially for an educational mini-language, and with ai to assist. i’ll research as i go.
the knowledge i actually need (and what i don’t)
for a serious production language you’d want: compiler theory, formal languages and automata, grammars, lexing, parsing, semantic analysis, code generation, optimization, systems/architecture (memory, cpu, registers, syscalls), language design (imperative, functional, oo, declarative; minimal vs verbose vs domain-specific).
for my mini-language (educational) i can simplify a lot:
- a bit of compiler theory: understand lexer, parser, interpreter
- basic lexing/parsing: convert code into tokens (
si,mientras,mostrar), recognize simple patterns via regex or a light parser - a small interpreter: execute the ast directly; e.g.,
mostrar "hola"becomes a node and callsconsole.log("hola") - language design in spanish: define keywords (
si,sino,repetir,mostrar), pick syntax - optional light semantics: simple checks (variables exist before use; basic type sanity)
key differences: production needs optimization and machine code; an educational mini-language only needs to analyze and execute basic instructions. it could even transpile to javascript or python behind the scenes.
how people will use it (distribution)
i don’t want heavy installers or platform-specific friction. the plan:
- publish the language on npm so anyone can install and run it locally
- build a web playground so learners can try it instantly without installing anything
two paths, same language: npm for local use; the playground for immediate experiments.
interpreter first: the simplest possible thing (javascript)
start minimal. one instruction: mostrar "texto".
function interpret(code) {
const lines = code.split("\n").map(line => line.trim()).filter(line => line)
for (const line of lines) {
if (line.startsWith("mostrar")) {
const match = line.match(/"(.*)"/)
if (match) {
console.log(match[1])
}
}
}
}
// Example usage
interpret(`
mostrar "hola mundo"
mostrar "aprendiendo programación"
`)
the interpreter receives the code as a string, splits lines, ignores empties, if a line starts with mostrar it looks for text in quotes and prints it.
expected output:
hola mundo
aprendiendo programación
now extend it with variables (variable x = 10, mostrar x):
requirements:
mostrar "texto"→ print a stringvariable x = valor→ define numeric or string variablesmostrar x→ print the variable’s value
function interpret(code) {
const lines = code.split("\n").map(line => line.trim()).filter(line => line)
const variables = {}
for (const line of lines) {
if (line.startsWith("variable")) {
const match = line.match(/^variable\s+(\w+)\s*=\s*(.+)$/)
if (match) {
let [, name, value] = match
if (value.startsWith('"') && value.endsWith('"')) {
value = value.slice(1, -1)
} else if (!isNaN(Number(value))) {
value = Number(value)
}
variables[name] = value
}
} else if (line.startsWith("mostrar")) {
const matchString = line.match(/"(.*)"/)
const matchVar = line.match(/^mostrar\s+(\w+)$/)
if (matchString) {
console.log(matchString[1])
} else if (matchVar) {
const varName = matchVar[1]
if (variables.hasOwnProperty(varName)) {
console.log(variables[varName])
} else {
console.error(`Variable "${varName}" is not defined`)
}
}
}
}
}
// Example usage
interpret(`
variable x = 10
variable saludo = "hola mundo"
mostrar x
mostrar saludo
mostrar "aprendiendo fundamentos"
`)
expected output:
10
hola mundo
aprendiendo fundamentos
how it works, step by step:
- the interpreter receives the full text
- it splits into lines, trims, and filters empties
- if a line starts with
variable, it storesname = valuein avariablesdictionary (numbers parsed withNumber, strings stripped of quotes) - if a line starts with
mostrar, it either prints a quoted string or looks up a variable and prints its value - net effect: a tiny engine that understands two spanish instructions (
variable,mostrar) and translates them into javascript operations (store in a map andconsole.log)
example of preprocessing:
"variable x = 10\nmostrar x\n"
→ ["variable x = 10", "mostrar x"]
from quick hack to something i can grow
this “if-ladder + regex” works, but will become unreadable and hard to extend. i want it professional, scalable, and easy to read. time to move to a classic three-layer architecture.
architecture (three layers):
source code (spanish)
↓
tokenizer → converts text to tokens
↓
parser → converts tokens to ast
↓
evaluator → executes the ast
directory structure:
javascript-interpret/
├── main.js # entry point
├── package.json
├── README.md # docs in spanish
├── README.en.md # docs in english
├── src/
│ ├── tokenizer.js # lexical analysis
│ ├── parser.js # syntax analysis
│ ├── evaluator.js # evaluation/execution
│ └── interpreter.js # coordinator
└── test/
└── test.js # automated tests
components:
1. tokenizer (src/tokenizer.js)
function: lexical analysis — recognizes keywords, numbers, strings, symbols
input: "variable edad = 25"
output: [ {type: "VARIABLE"}, {type: "IDENTIFIER", lexeme: "edad"}, ... ]
2. parser (src/parser.js)
function: syntax analysis — builds the ast
input: token list
output: ast (abstract syntax tree)
3. evaluator (src/evaluator.js)
function: evaluation — executes the program
input: ast
output: program results
4. interpreter (src/interpreter.js)
function: orchestration — runs tokenizer → parser → evaluator
input: full source code
output: final result
execution flow example ("variable edad = 25"):
// 1. TOKENIZER
"variable edad = 25" →
[
{type: "VARIABLE", lexeme: "variable"},
{type: "IDENTIFIER", lexeme: "edad"},
{type: "EQUAL", lexeme: "="},
{type: "NUMBER", lexeme: "25", literal: 25}
]
// 2. PARSER
Tokens →
{
type: "VariableDeclaration",
name: "edad",
initializer: {type: "Literal", value: 25}
}
// 3. EVALUATOR
AST → executes: environment.define("edad", 25)
// 4. RESULT
Variables: { edad: 25 }
design principles
- modularity: each component has a single responsibility
- simplicity: start with two commands (
variable,mostrar) and minimal syntax - extensibility: structure ready to add features later
- testability: unit test each module
- readability: clear separation of concerns
- educational value: a good mental model for how interpreters work
advantages of this architecture
- scalable: easy to add tokens, operators, statements
- maintainable: each file has a focused purpose
- testable: tokenize/parse/evaluate independently
- readable: code stays organized
- educational: the structure mirrors compiler theory
tokenizer, conceptually
scanToken() {
const char = this.advance();
switch (char) {
case '=': this.addToken("EQUAL"); break;
case '"': this.string(); break;
default:
if (this.isDigit(char)) this.number();
else if (this.isAlpha(char)) this.identifier();
}
}
parser, conceptually
variableDeclaration() {
const name = this.consume("IDENTIFIER", "Expected variable name");
let initializer = null;
if (this.match("EQUAL")) {
initializer = this.expression();
}
return { type: "VariableDeclaration", name: name.lexeme, initializer };
}
evaluator, conceptually
execute(statement) {
switch (statement.type) {
case "VariableDeclaration":
this.executeVariableDeclaration(statement);
break;
case "MostrarStatement":
this.executeMostrarStatement(statement);
break;
}
}
executeVariableDeclaration(statement) {
let value = null;
if (statement.initializer !== null) {
value = this.evaluateExpression(statement.initializer);
}
this.environment.define(statement.name, value);
}
interpreter orchestration, conceptually
interpret(source) {
try {
const tokens = this.tokenizer.tokenize(source);
const parser = new Parser(tokens);
const statements = parser.parse();
const output = this.evaluator.evaluate(statements);
return { success: true, output, error: null };
} catch (error) {
return { success: false, output: [], error: error.message };
}
}
adding control flow: si / sino (if / else)
next, i add conditionals to make the language actually branch.
new keywords
si,sino,verdadero,falso
comparison operators
>,>=,<,<=,==,!=
arithmetic operators
+,-,*,/, and unary-+concatenates strings too
block syntax
{and}delimit blocks
tokenizer additions (concept)
const keywords = {
si: "SI",
sino: "SINO",
verdadero: "TRUE",
falso: "FALSE"
}
// operators, braces, etc.
// case ">": this.addToken("GREATER"); break;
// case "==": this.addToken("EQUAL_EQUAL"); break;
// case "{": this.addToken("LEFT_BRACE"); break;
parser additions (concept)
// parse if
ifStatement() {
const condition = this.expression();
this.consume("LEFT_BRACE", "Expected { after condition");
const thenBranch = this.block();
this.consume("RIGHT_BRACE", "Expected } after block");
let elseBranch = null;
if (this.match("SINO")) {
// parse else branch
}
return { type: "IfStatement", condition, thenBranch, elseBranch };
}
evaluator additions (concept)
// conditions
executeIfStatement(statement) {
if (this.isTruthy(this.evaluateExpression(statement.condition))) {
this.executeBlock(statement.thenBranch);
} else if (statement.elseBranch !== null) {
this.executeBlock(statement.elseBranch);
}
}
isTruthy(value) {
if (value === null) return false;
if (typeof value === "boolean") return value;
return true;
}
// comparisons
evaluateBinaryExpression(left, operator, right) {
switch (operator) {
case "GREATER": return left > right;
case "EQUAL_EQUAL": return left === right;
case "GREATER_EQUAL": return left >= right;
// ... other operators
}
}
resulting syntax
// simple conditional
variable edad = 18
si edad >= 18 {
mostrar "mayor de edad"
}
// with else
variable edad = 16
si edad >= 18 {
mostrar "mayor de edad"
} sino {
mostrar "menor de edad"
}
architecture stays the same: tokenizer → parser → evaluator; the ast now includes IfStatement.
result: the language makes decisions and reacts to conditions — much more useful for actual learning.
where this leaves me
this is an educational, spanish-first language: minimal, approachable, open source. it will be installable via npm and usable instantly in a web playground. i started with the tiniest interpreter, then moved to a modular architecture i can extend. from mostrar and variable to si/sino, the path is clear.
i’ll keep adding features over time, but the idea is already there: a small language that lowers the barrier for beginners who can’t rely on english, and a codebase that’s clean enough to learn from as it grows.
repo: hispano-lang