Introduction

Welcome to the Dewy Programming Language Book. This book covers all aspects of the language, including syntax, style, the standard library, examples, etc.

Please note that this book and the language are still very much a work in progress. The book is still missing many chapters, and the current language implementation is very incomplete.

What is Dewy

The Dewy Programming Language is a simple yet powerful general purpose language designed with engineering applications in mind. Think the functionality and ease of use of MATLAB or Python combined with the speed of a compiled language like C or Rust.

Who is Dewy for

Dewy is for everyone! Dewy is designed to be easy to learn and use, while still being powerful enough to be used for real world applications. Dewy is designed to be a general purpose language, and can be used for anything from simple scripting to complex engineering applications.

Features

Dewy has many unique and uncommon features allowing it to be powerful and easy to use. Some key features include:

Functional and Imperative - Dewy is an imperative language with strong support for functional programming. This allows for a very flexible programming style, where you can use the best tool for the job.
Expression based syntax - Dewy uses an expression based syntax, meaning that everything is an expression. This allows for a very simple yet powerful syntax, where common language features are often just a free consequence of the syntax.
Garbage-collector-free memory management - Dewy uses a unique memory management system, allowing for fast and efficient memory management without the need for a garbage collector.
Strong type system - Dewy has a powerful static type system with inference, reminiscent of those in TypeScript and Julia.
Built in unit system - Dewy has a built in unit system, allowing you to easily work with units and convert between them. This is especially useful for engineering applications.
Strong math support - Dewy has strong support for many math features, including complex numbers, quaternions, vectors, matrices, and more. This is especially useful for engineering applications.

Getting Started

Dewy is still in its early development stages, but you can try it out! Currently this chapter explains how to use the current Python interpreter implementation:

Online interpreter
Installing the interpreter on Linux
Writing hello world

Long term, this chapter will explain how to use the compiler, and provide several hello world examples for various different applications of Dewy.

Installing the compiler on Linux, Windows, and Mac
Writing hello world
Simple examples for other domains

Online Interpreter

This is a simple online interpreter for the Dewy programming language. It's very much a work in progress, and only supports a subset of the language features, but it should give a good idea of what the language is like.

Long term, this will be replaced with a less janky version that supports the full language.

Installation

Currently, only the Python interpreter backend is available. To install it, you must have Python 3.11 or later installed. Then to install:

Clone the repo

$ git clone git@github.com:david-andrew/dewy-lang.git

Run the install script
```
$ cd dewy-lang
$ python install.py
```
Log out and back in

Verify Install

When you do have the language properly installed, you should be able to verify that it works like so:

$ dewy -v
Dewy 0.0.0

Hello, World!

It's traditional in most languages to write a small program that prints "Hello, World!" to the screen. Achieving this is super simple in Dewy!

Put Your Code in a Directory

It's probably a good idea to put your code in a dedicated folder.

$ mkdir ~/code
$ cd ~/code
$ mkdir hello_world
$ cd hello_world

Write the Source Code

Next we'll create the source file. In a text editor of your choosing, create a file called hello.dewy.

Then in the text editor, enter the following code

When you are done in the text editor, save and close the file.

Run the Code

Running a dewy file is as simple as invoking the file with the dewy command

$ dewy hello.dewy

Which should print Hello, World! in the terminal.

How it Works

This code invokes the printl function with the string 'Hello, World!'. printl is a commonly used function that takes text and prints it to the terminal, followed by a newline.

Compiling and Running Are the Same Step

NOTE: this is not relevant until the LLVM/other compiler backends are implemented.

When you run the program, you are actually doing two things: first compiling, and then running.

Compiling is the process that translates the code from Dewy, which your computer doesn't understand natively, to machine language which it does understand. The resulting translation is saved to a file, called an executable, that your computer can run directly. Once the executable is created, the dewy command then automatically runs it for you.

All of this goes on under the hood, so you don't have to worry about it. But you might notice the effects of this process, e.g. the first time you run a program, it might take a bit longer than subsequent runs. Additionally, you might notice a hidden directory containing the executable, and perhaps other files related to the compilation process. In this case, the directory is called .hello/ and contains the executable hello.

Hello, Many Worlds!

This chapter contains short example programs over a variety of different domains--basically a Hello World for each problem. Each of these should serve as a quick start guide for common tasks in Dewy.

GUI: creating a window with a click counter
Graphics: drawing a triangle from scratch
Audio: playing a sine wave through the speakers
Networking: building a simple client/server chat program
2D Game Development: making a 2D Flappy Bird clone
3D Game Development: making a 3D Racing Game
Web Development: building a simple website
Databases: creating a simple data store
Cryptography: encrypting and decrypting a string
Operating Systems: running Hello World on bare metal
Compilers: building a toy compiler
Scientific Computing: making an Infinite Zooming Mandelbrot Set
Robotics: Forward and inverse kinematics of a 6-DoF robot arm
Machine Learning: training a simple MLP on MNIST

GUI

(TODO: creating a window with a click counter)

Graphics

(TODO: drawing a triangle from scratch)

Audio

(TODO: playing a sine wave through the speakers)

Networking

(TODO: building a simple client/server chat program)

2D Game Development

(TODO: making a 2D Flappy Bird clone)

3D Game Development

(TODO: making a 3D Racing Game)

Web Development

(TODO: building a simple website)

Databases

(TODO: creating a simple data store)

Cryptography

(TODO: encrypting and decrypting a string)

Operating Systems

(TODO: hello world boot loader program on raspberry pi)

Compilers

(TODO: building a toy compiler. perhaps something stack based, e.g. simplified porth. or perhaps something like a toy dewy/c/etc. compiler.)

Scientific Computing

(TODO: Making an Infinite Zooming Mandelbrot Set)

Robotics

(TODO: forward and inverse kinematics of a 6DoF robot arm using homogeneous transformation matrices)

Machine Learning

(TODO: training a simple MLP on MNIST)

Language Features

Each subsequent section explains a different facet of the language.

Expressions, Statements, and Blocks

Expressions

Dewy is an expression based language. An expression is literally just a value, or something that evaluates to a value. Results of expressions can be stored in variables, or used to build up more complicated expressions.

The simplest type of expression is any literal value, such as an integer for instance

This expression can easily be bound to a variable

Calling a function is an expression if the function returns a value. For example, the sqrt function returns the square root of a value

And now my_expression contains the value 8.

Expressions can also be used to build up more complicated expressions

In this example, at the highest level, there is a string expression, which contains a nested expression. The nested expression sqrt(64) + 9 * cos(pi) is a mathematical expression, built up from smaller expressions combined with the math operators + and *. sqrt(64) and cos(pi) are both function call expressions, 64 and 9 are literal expressions, and pi is an identifier for a constant value.
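The same nesting can be mirrored in Python, used here only as a stand-in because it is the language of the current interpreter backend. This is a sketch of the idea, not Dewy syntax, and the variable names `nested` and `message` are illustrative.

```python
import math

# A function call that returns a value is an expression; bind it to a name.
my_expression = math.sqrt(64)                      # evaluates to 8.0

# Smaller expressions combine into larger ones through operators.
nested = math.sqrt(64) + 9 * math.cos(math.pi)     # 8.0 + 9 * (-1.0)

# The arithmetic expression is itself nested inside a string expression.
message = f"The result is {nested}"
print(message)
```

Each piece (literal, identifier, call, operator application, string) is a value-producing expression, which is what lets them nest freely.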

Statements

A statement is a single piece of code that expresses no value (typically referred to as void). For example calling the printl function, which prints out a string to the console

This function call doesn't return a value. If you tried to store the result into a variable, you'd get a compilation error

Most expressions in Dewy will return something, but you can easily convert an expression into a void statement by appending a semicolon ; to the end of the expression

In this example, the resulting value of each sqrt call is suppressed by the semicolon, and the array captures only the non-suppressed values, resulting in my_expression = [2 4 8].

Note: the one context where semicolon does not suppress the value of an expression is in a multidimensional array literal. In this context, semicolons are used to indicate new dimensions of the array, and values with semicolons are still captured.

Blocks

A block is just a sequence of expressions or statements wrapped in either {} or (). A block is itself an expression.

Note: the distinction between {} and () blocks has to do with the scope of the block. Any expressions inside a {} block receive a new child execution scope, while those inside a () block share the same scope as the parent where the block is written. Scope will be explained in greater detail later (TODO: link)

Let's start with the simplest type of a block, the empty block

Empty blocks have type void since they don't contain any expressions, thus making the overall block not express anything.

Adding a single expression to a block makes the block itself express that value

Adding multiple expressions to a block makes the block express multiple values (TODO: link to generators)

TODO->rest of explanation of blocks.

catching values expressed in blocks
blocks for precedence overriding
blocks work anywhere an expression is expected

Units

Dewy was designed from day 1 to include physical units such as kilogram, meter, second.

A Simple Example

Using units is quite straightforward, as they are just another expression. Juxtaposing a unit with a number will multiply the number by the unit, and you can use this to build up more complex expressions.

The energy variable now contains a value of 9000 joules. For more complex unit expressions, sometimes it is necessary to use parentheses to group terms together. In general it is good style to do so, except for the simplest unit expressions. See Operator Precedence.
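As a rough model of how "a unit is just another expression" can work, the Python sketch below tracks a value together with exponents of mass, length, and time. The `Quantity` class and the chosen numbers are assumptions made for illustration (picked so the result comes out to 9000 joules); this is not how Dewy is implemented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    """A number paired with exponents of the (mass, length, time) dimensions."""
    value: float
    dims: tuple = (0, 0, 0)

    def __mul__(self, other):
        other = other if isinstance(other, Quantity) else Quantity(other)
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    __rmul__ = __mul__  # lets a plain number appear on the left: 20 * kg

    def __truediv__(self, other):
        other = other if isinstance(other, Quantity) else Quantity(other)
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)

    def __pow__(self, n):
        return Quantity(self.value ** n, tuple(a * n for a in self.dims))

kg = Quantity(1, (1, 0, 0))
m = Quantity(1, (0, 1, 0))
s = Quantity(1, (0, 0, 1))
joule = kg * m**2 / s**2          # a derived unit built from base units

mass = 20 * kg                    # multiplying a number by a unit
velocity = 30 * m / s
energy = 0.5 * mass * velocity**2
print(energy.value, energy.dims == joule.dims)
```

Because multiplication adds dimension exponents, the kinetic energy automatically carries the same dimensions as the joule.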

Here are several more examples of unit expressions:

SI Prefixes

Note: SI prefixes only work for SI base and derived units (and a few exceptions noted below). Also the abbreviated forms of prefixes may only be combined with abbreviated units, and written out prefixes may only be combined with written out units. E.g. kilograms and kg are valid, but kgrams and kilog are invalid.

Prefix	Abbrev.	Scale
`yotta`	`Y`	10^24
`zetta`	`Z`	10^21
`exa`	`E`	10^18
`peta`	`P`	10^15
`tera`	`T`	10^12
`giga`	`G`	10^9
`mega`	`M`	10^6
`kilo`	`k`	10^3
`hecto`	`h`	10^2
`deca`	`da`	10^1
`deci`	`d`	10^−1
`centi`	`c`	10^−2
`milli`	`m`	10^−3
`micro`	`μ`/`u`	10^−6
`nano`	`n`	10^−9
`pico`	`p`	10^−12
`femto`	`f`	10^−15
`atto`	`a`	10^−18
`zepto`	`z`	10^−21
`yocto`	`y`	10^−24
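The pairing rule from the note above the table (abbreviated prefixes combine only with abbreviated units, written-out prefixes only with written-out units) can be sketched in Python as follows. The lookup tables are deliberately tiny and the function name `prefix_exponent` is hypothetical; it is not part of Dewy.

```python
# Assumed, heavily abbreviated tables: prefix -> power of ten.
LONG_PREFIXES = {"kilo": 3, "milli": -3, "mega": 6}
SHORT_PREFIXES = {"k": 3, "m": -3, "M": 6}
LONG_UNITS = {"gram", "grams", "meter", "meters"}
SHORT_UNITS = {"g", "m"}

def prefix_exponent(name):
    """Return the power of ten for a prefixed unit name, or None if invalid."""
    pairings = ((LONG_PREFIXES, LONG_UNITS), (SHORT_PREFIXES, SHORT_UNITS))
    for prefixes, units in pairings:
        for prefix, exponent in prefixes.items():
            # The remainder after the prefix must be a unit of the same style.
            if name.startswith(prefix) and name[len(prefix):] in units:
                return exponent
    return None

for name in ("kilograms", "kg", "kgrams", "kilog", "mm"):
    print(name, prefix_exponent(name))
```

Mixed forms such as kgrams and kilog fall through both pairings and are rejected.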

Non-SI units that may receive SI prefixes:

psi (e.g. kpsi = 1000(psi))
torr (e.g. mTorr = 0.001(torr))
bar (e.g. mbar = 0.001(bar))
eV (e.g. keV = 1000(eV))
cal (e.g. kcal = 1000(cal))
(TODO: probably more)

Binary Prefixes

Note: These prefixes are exclusively for use with units of information (e.g. bit/byte)

Prefix	Abbrev.	Scale
`kibi`	`Ki`	2^10
`mebi`	`Mi`	2^20
`gibi`	`Gi`	2^30
`tebi`	`Ti`	2^40
`pebi`	`Pi`	2^50
`exbi`	`Ei`	2^60
`zebi`	`Zi`	2^70
`yobi`	`Yi`	2^80

Full List of Units

(TODO->maybe like solidworks, allow user to set unit system, e.g. meters-kilograms-seconds, centimeters-grams-seconds, etc. See: https://en.wikipedia.org/wiki/MKS_system_of_units https://en.wikipedia.org/wiki/Metre%E2%80%93tonne%E2%80%93second_system_of_units https://en.wikipedia.org/wiki/Foot%E2%80%93pound%E2%80%93second_system https://en.wikipedia.org/wiki/Centimetre%E2%80%93gram%E2%80%93second_system_of_units )

Base Units

Note: abbreviated units and prefixes are case sensitive, while fully written out units and prefixes are case insensitive

Quantity	Symbol	Abbrev. Units	Full Units
Mass*	`[M]`	`g` `k` `lbm` -	`gram`/`grams` `kilo`/`kilos` `pound-mass`/`pounds-mass` `slug`/`slugs`
Length	`[L]`	`m` - `ft` `yd` `mi` - `AU` - -	`meter`/`meters`/`metre`/`metres` `inch`/`inches` `foot`/`feet` `yard`/`yards` `mile`/`miles` `nautical_mile`/`nautical_miles` `astronomical_unit`/`astronomical_units` `light_year`/`light_years` `parsec`/`parsecs`
Time	`[T]`	`s` - - - - - - - - -	`second`/`seconds` `minute`/`minutes` `hour`/`hours` `day`/`days` `week`/`weeks` `month`/`months` `year`/`years` `decade`/`decades` `century`/`centuries` `millennium`/`millennia`
Electric Current	`[I]`	`A`	`amp`/`amps`/`ampere`/`amperes`
Thermodynamic Temperature	`[Θ]`	`K` `°R`/`°Ra` `°C` `°F`	`kelvin` `rankine`/`degrees_rankine` `celsius`/`degrees_celsius` `fahrenheit`/`degrees_fahrenheit`
Amount of Substance	`[N]`	`mol`	`mole`/`moles`
Luminous Intensity	`[J]`	`cd`	`candela`/`candelas`

(TODO: metric vs us vs etc. tons)

Note: in SI, the base unit for mass is kg/kilograms, not g/grams. k/kilo is provided as a convenience to allow for a mass base unit without a prefix. e.g. kilokilo would be equivalent to 1000(kilograms).

(TODO: exact durations of longer units. e.g. sidereal day vs solar day, etc.)

Note: the plural of kelvin is kelvin, not kelvins

Named Derived Units

Quantity	Abbrev. Units	Full Units
Plane Angle	`rad` `°`	`radian`/`radians` `degree`/`degrees`
Solid Angle	`sr`	`steradian`/`steradians`
Frequency	`Hz`	`hertz`
Force / Weight	`N` `lb`/`lbf`	`newton`/`newtons` `pound`/`pounds`/`pound-force`/`pounds-force`
Pressure / Stress	`Pa` `atm` `bar` `psi` `torr` `mmHg` `inH2O`	`pascal`/`pascals` `atmosphere`/`atmospheres` `bar` `pounds_per_square_inch` `torr` `millimeters_of_mercury` `inches_of_water`
Energy / Work / Heat	`J` `cal` `Cal`* `BTU` `eV` `Wh` `erg`	`joule`/`joules` `calorie`/`calories` `kilocalorie`/`kilocalories` `british_thermal_unit`/`british_thermal_units` `electron_volt`/`electron_volts` `watt_hour`/`watt_hours` `erg`/`ergs`
Power / Radiant Flux	`W` `hp`	`watt`/`watts` `horsepower`
Electric Charge / Quantity of Electricity	`C`	`coulomb`/`coulombs`
Voltage / Electrical Potential / EMF	`V`	`volt`/`volts`
Capacitance	`F`	`farad`/`farads`
Resistance / Impedance / Reactance	`Ω`	`ohm`/`ohms`
Electrical Conductance	`S`	`siemens`
Magnetic Flux	`Wb`	`weber`/`webers`
Magnetic Flux Density	`T`	`tesla`/`teslas`
Inductance	`H`	`henry`/`henries`
Luminous Flux	`lm`	`lumen`/`lumens`
Illuminance	`lx`	`lux`/`luxes`
Radioactivity (Decays per unit time)	`Bq`	`becquerel`/`becquerels`
Absorbed Dose (of Ionizing Radiation)	`Gy`	`gray`/`grays`
Equivalent Dose (of Ionizing Radiation)	`Sv`	`sievert`/`sieverts`
Catalytic Activity	`kat`	`katal`/`katals`

Note: Cal is equivalent to kcal or kilocalorie (i.e. 1000(calories)).

Weird Units

(TODO->all other units + weird units. e.g. drops)

Other Units

Quantity	Abbrev. Units	Full Units
Information	`b`/`bit` `B`/`byte`	`bit`/`bits` `byte`/`bytes`

(TODO: where do decibels go? B is already taken by byte... perhaps the user can select what units get imported by importing units from different domains, e.g. import units from si or import units from information) (TODO: other units to add: Hz, liter, gallon, oz, phon/lufs/other sound intensity related units)

String Interpolation

Including variable values inside of a string is handled with string interpolation.

which will print the string I am 24 years old. Any arbitrary expression can be contained inside of the curly braces. For expressions that are not a string by default, the __str__ method will be called on them to get the string version.
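Python's f-strings behave in a closely analogous way, so they make a convenient illustration of the mechanism described here. The `Celsius` class is a made-up example type; the point is only that a non-string value is converted through its `__str__` method.

```python
class Celsius:
    """A made-up type that defines how it is rendered as text."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return f"{self.degrees} degrees Celsius"

age = 24
print(f"I am {age} years old")            # a plain variable
print(f"Next year I will be {age + 1}")   # an arbitrary expression
print(f"It is {Celsius(21)} outside")     # __str__ supplies the text
```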

Basic Data Types

Numeric

Numeric data is what it sounds like, values that represent a number

Numbers

Numbers are the base case for numerical values, with each subsequent type being a more specific / restricted version of the number class.

Integers

Integers are numbers that do not contain any decimal component. By default, integers can be arbitrarily large, but fixed width integers are also possible

The full list of integer types includes

Type	Description	Range
int	Arbitrary precision signed integer	`(-inf..inf)`
int8	8-bit signed integer	`[-128..127]`
int16	16-bit signed integer	`[-32768..32767]`
int32	32-bit signed integer	`[-2147483648..2147483647]`
int64	64-bit signed integer	`[-9223372036854775808..9223372036854775807]`
int128	128-bit signed integer	`[-170141183460469231731687303715884105728..170141183460469231731687303715884105727]`
uint	Arbitrary precision unsigned integer	`[0..inf)`
uint8	8-bit unsigned integer	`[0..255]`
uint16	16-bit unsigned integer	`[0..65535]`
uint32	32-bit unsigned integer	`[0..4294967295]`
uint64	64-bit unsigned integer	`[0..18446744073709551615]`
uint128	128-bit unsigned integer	`[0..340282366920938463463374607431768211455]`
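The bounds in the table follow directly from the bit width. A short Python check, assuming the usual two's complement layout for signed types (the helper name `int_range` is illustrative):

```python
def int_range(bits, signed=True):
    """Smallest and largest value of a fixed width integer type."""
    if signed:
        # Two's complement: one more negative value than positive.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

for bits in (8, 16, 32, 64, 128):
    print(f"int{bits}: {int_range(bits)}")
    print(f"uint{bits}: {int_range(bits, signed=False)}")
```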

Custom Ranged Integers

You can create integer types with a custom range by specifying the range as part of the type annotation

TBD for behavior when value goes out of bounds. Perhaps result will be undefined, or there can be a type setting for wrap around

Fixed Point

Fixed point will be stored as two integers, digits and shift where the value is digits * 10^shift
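To make the digits and shift representation concrete, here is a small Python sketch that computes the exact value of such a pair. The function name `fixed_point_value` is hypothetical, and `Fraction` is used only to show the value without rounding.

```python
from fractions import Fraction

def fixed_point_value(digits, shift):
    """Exact value of a fixed point number stored as digits * 10^shift."""
    return digits * Fraction(10) ** shift

price = fixed_point_value(12345, -2)      # 123.45
large = fixed_point_value(5, 3)           # 5000
print(float(price), int(large))

# Decimal fractions stay exact, unlike binary floating point.
tenth, fifth = fixed_point_value(1, -1), fixed_point_value(2, -1)
print(tenth + fifth == fixed_point_value(3, -1), 0.1 + 0.2 == 0.3)
```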

TBD on the syntax for declaring a fixed point number. likely to be a function call e.g.

Rational

Rational numbers are stored as two integers, the numerator and the denominator, where the value is numerator / denominator
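Python's standard `fractions.Fraction` type uses the same numerator and denominator representation, so it can illustrate the behavior one might expect. Whether Dewy reduces to lowest terms the way `Fraction` does is an assumption here.

```python
from fractions import Fraction

third = Fraction(1, 3)
total = third + third + third      # exactly 1, with no rounding error
half = Fraction(2, 4)              # Fraction normalizes this to 1/2

print(total)
print(half.numerator, half.denominator)
```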

TBD on the syntax for declaring a rational number. likely to be a function call e.g.

Real

Real numbers are positive and negative numbers that can have a decimal component to them. The default real will be stored as a float64 i.e. a 64-bit floating point number, but other widths (and potentially arbitrary precision) are possible

Boolean

Standard true/false type

Complex

complex numbers

Quaternions

MISC.

Other datatypes (probably include on this page)

strings
symbolics
units
types
enums or tokens

Container Types

Container types are things like arrays, dictionaries, and sets. All containers are specified using square brackets [], while the contents (or other factors) determine the type of container.

Arrays

An array is simply a list of values inside a container

Note: values do not need commas to separate them. Also arrays can contain objects of different types, though arrays where all values are just a single type will be more efficient. Arrays are 0-indexed (with potentially the option to set an arbitrary index)

TODO->explain how to make matrices and other linear algebra stuff.

Dictionaries

A dictionary is a list of key-value pairs

Again note the lack of a need for comma separation between key-value pairs.

Additionally if you wish, you can define a bi-directional dictionary using a double-ended arrow:

Note: when creating a bidictionary, every arrow must be double-ended. As new elements are added, the bidictionary will maintain the bidirectional links between each element. Regular dictionaries will not maintain such links.
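One way to picture the bookkeeping a bidictionary has to do is the Python sketch below. The `BiDict` class and its `link` method are hypothetical names, and the choice to drop stale partners when an element is re-linked is an assumption, not documented Dewy behavior.

```python
class BiDict:
    """Minimal two-way mapping: every element points at its partner."""

    def __init__(self, pairs=()):
        self._links = {}
        for left, right in pairs:
            self.link(left, right)

    def link(self, left, right):
        # Drop any existing partners so links never become one-directional.
        for key in (left, right):
            if key in self._links:
                partner = self._links.pop(key)
                self._links.pop(partner, None)
        self._links[left] = right
        self._links[right] = left

    def __getitem__(self, key):
        return self._links[key]

    def __contains__(self, key):
        return key in self._links

numbers = BiDict([("one", 1), ("two", 2)])
print(numbers["one"], numbers[1])   # lookups work in both directions
numbers.link("three", 3)            # new elements keep both links
print(numbers[3])
```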

Sets

A set is an unordered collection of elements

Objects

See the entry on Object and Class Types for more details. But briefly, an object can be created by wrapping declarations in a container

Ranges

A range represents some span over a set of values. Typically ranges will be over numbers, however any orderable set could be used for a range (e.g. strings, dates, etc.). Ranges are frequently used for loops, indexing, and several other places.

Ranges always contain a .. and may include left and/or right values that are juxtaposed, optionally specifying the bounds and step size.

Range Syntax

The syntax for ranges is inspired by Haskell syntax for ranges:

Like in Haskell, you can use a tuple to include a second value to specify the step size.

Note: [first..2ndlast,last] is explicitly NOT ALLOWED, as it can have unintuitive behavior, and is covered by [first,second..last].

In addition, ranges can have their bounds be inclusive or exclusive. Inclusive bounds are indicated by square brackets, and exclusive bounds are indicated by parentheses. The default is inclusive bounds. Also left and right bounds can be specified independently, so you can have a range that is inclusive on the left and exclusive on the right, or vice versa.
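The rules so far (an optional second value that fixes the step, and independently inclusive or exclusive bounds) can be modeled for finite numeric ranges with the Python sketch below. The function `dewy_range` and its parameter names are hypothetical, and treating an exclusive bound as simply dropping the endpoint is an assumption.

```python
def dewy_range(first, last, second=None, include_first=True, include_last=True):
    """List the values of a finite numeric range.

    The step is second - first when a second value is given, otherwise 1.
    """
    step = 1 if second is None else second - first
    if step == 0:
        raise ValueError("step must be non-zero")
    values = []
    i = 0
    while True:
        value = first + i * step
        past_end = value > last if step > 0 else value < last
        if past_end:
            break
        values.append(value)
        i += 1
    # Exclusive bounds drop the endpoint if it was reached.
    if not include_first and values and values[0] == first:
        values = values[1:]
    if not include_last and values and values[-1] == last:
        values = values[:-1]
    return values

print(dewy_range(0, 5))                       # models [0..5]
print(dewy_range(0, 10, second=2))            # models [0,2..10]
print(dewy_range(0, 5, include_last=False))   # models [0..5)
print(dewy_range(5, 0))                       # increasing step, so empty
print(dewy_range(5, 0, second=4))             # models [5,4..0]
```

The last two calls show why a decreasing range needs an explicit step: with the default step of 1, a range from 5 to 0 contains nothing.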

Juxtaposition

The left and right values are only considered part of the range if they are juxtaposed with the .. symbol. Values not juxtaposed with the range are considered separate expressions.

Note: the range juxtaposition operator has quite a low precedence. Most operators will have higher precedence, meaning the left and right expressions don't need to be wrapped in parentheses. However, due to range juxtapose's low precedence, any range as part of an in expression (e.g. a in [A..B]) must be wrapped in range bounds (i.e. [], (], [), ()) to ensure the range is parsed correctly. a in A..B will be parsed as (a in A)..B .

Multiple ranges can be used to index into multidimensional matrices

Numeric Ranges

Probably the most common type of range will be numeric ranges, which describe a span over real numbers.

Some simple examples include:

Ordinal Ranges

Ranges can also be constructed using any ordinal type. Currently the only built in ordinal type other than numbers is strings.

For example, the following range captures all characters in the range from 'a' to 'z' inclusive

All alphabetical characters might be represented like so

But ranges probably won't be limited to individual characters. In principle, you ought to be able to do something like this

which would create a range that consists of every possible 5 letter combination starting from the word 'apple' and iterating through to the word 'zebra'.

Note: this is distinct from every dictionary word in that range, as it will include many many gibberish words.

TBD exactly what criteria will be used for ordering strings, as I like string orderings that respect numbers embedded in them (e.g. 'apple2' should come before 'apple10'), but that becomes difficult with arbitrary strings. Perhaps there might be a macro setting for the ordering type used
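The number-aware ordering mentioned here is often called natural sort order. The Python sketch below shows one common way to get it; it is only an illustration of the idea, since the ordering Dewy will use is still undecided, and `natural_key` is a made-up name.

```python
import re

def natural_key(text):
    """Sort key that compares embedded digit runs by numeric value."""
    # Splitting on a captured group alternates text, digits, text, ...
    parts = re.split(r"(\d+)", text)
    # Odd positions are the digit runs; compare those as integers.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

words = ["apple10", "apple2", "apple1"]
print(sorted(words))                    # plain character-by-character order
print(sorted(words, key=natural_key))   # number-aware order
```

Plain lexicographic comparison puts 'apple10' before 'apple2' because '1' sorts before '2'; the key function avoids that by comparing 10 and 2 as numbers.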

Range Uses

Ranges have a variety of different uses in Dewy.

Ranges in Loops

Probably the most common use is in conjunction with loops as a sequence to iterate over:

The above will print '0 1 2 3 4 5 '.

To iterate over values in reverse, you can specify a reversed range:

This prints '5 4 3 2 1 0 '.

Note: when specifying a reversed range, you must include the step size. Forgetting to specify the step size will result in an empty range, as ranges are normally increasing.

Range Arithmetic

Ranges can be used in arithmetic expressions, often as a way to construct new ranges.

Both of which will print '0 0.25 0.5 0.75 1 '

These are both equivalent to directly constructing the range [0,0.25..1], however the arithmetic versions are frequently more convenient.
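The equivalence can be checked with a short Python model, where list comprehensions stand in for range arithmetic. This is a sketch of the arithmetic only and does not reflect Dewy's lazy range representation.

```python
integers = list(range(0, 5))                 # models the range [0..4]
scaled = [i / 4 for i in integers]           # divide every element by 4
stepped = [0 + k * 0.25 for k in range(5)]   # models [0,0.25..1]

print(" ".join(f"{x:g}" for x in scaled))    # prints: 0 0.25 0.5 0.75 1
print(scaled == stepped)
```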

This type of construction is closely related to the linspace()/logspace() functions in Dewy. TBD but linspace/logspace may in fact be implemented like so:

Compound Range Construction

Additionally you can construct more complex ranges by combining together multiple ranges:

The same range as above can be constructed using subtraction

Interval Membership

You can also check if a value falls within a specified range

Indexing Sequences

Ranges can be used to select values from a sequence. For example, to take a substring, we can do the following

This works for any sequence type.

Note: only integer ranges can be used to index into sequences (TBD if this might be relaxed to real valued ranges).

Also, because array access is a juxtapose expression, we can easily make the selection inclusive or exclusive on either side.

You can specify that a range continues to the end of the sequence, or starts from the beginning by omitting the value for that side. This will construct a range that goes to infinity in that direction, which will select all elements in that direction.

Paired with the special end token which represents the index of the last element in a sequence, this provides the means to select any desired subset of a sequence.

Object and Class Types

Object types are basically just containers containing assignments of variables and functions

You can access the members of the object using the . accessor

To create an object constructor (like many languages have classes that you can instantiate) we create a function that returns an object

You can also store functions inside of objects, allowing objects to completely cover regular object behaviors from other languages

There will not be any sort of this or self parameter as in other languages to access an object's members from within itself/any contained functions. Instead, because the functions are at the same scope as the declaration of the object's members, those members are available to the function.
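Closures give a reasonable mental model for this. In the Python sketch below, a constructor function returns an object whose functions reach the shared member through the enclosing scope rather than through a self parameter. The names `make_counter`, `increment`, and `current` are illustrative, and `SimpleNamespace` merely stands in for an object literal.

```python
from types import SimpleNamespace

def make_counter(start=0):
    """Constructor function: returns an object built from local declarations."""
    count = start                 # a member, declared in the enclosing scope

    def increment():
        nonlocal count            # reach the member directly, no self needed
        count += 1
        return count

    def current():
        return count

    return SimpleNamespace(increment=increment, current=current)

counter = make_counter(10)
counter.increment()
print(counter.current())
```

Each call to the constructor creates a fresh scope, so separate objects do not share state.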

If we remove any unnecessary syntax, the shorthand for constructing an object looks like this:

that is, just a function that returns an object literal, no need for braces or return.

Dunder Methods

Similar to Python, objects can define custom so-called "double-underscore" or "dunder" methods, which hook into the language's built-in functionality.

Though actually for __add__, it might make more sense for it to be global, and you add an alternate that gets dispatched on rather than including it in the object itself:

(TODO: longer explanation)

Function Types

Functions are first class citizens in Dewy. In fact many concepts from functional programming are included in Dewy, as they frequently allow for cleaner and more concise code.

Function Literals

To create a function, simply bind a function literal to a variable

A function literal consists of the arguments, followed by the => operator, followed by a single expression that is the function body. In the above example, the function takes no input arguments, and doesn't return any values. Instead it simply prints a string to the terminal.

Here's an example that takes two arguments

In fact we can simplify the above function's declaration quite a bit since blocks return expressions present in the body.

When there is a single argument, you may omit the parentheses around the argument list

Zero-argument functions require an empty pair of parentheses:

Default Arguments

Function arguments can have default values, which are used if the argument is not specified in the function call.

Calling functions

TODO

calling a function with name, parentheses, and args
functions with no arguments can omit the parentheses

Optional, Name-only and Positional-only Arguments

TODO

also explain about overwriting previously specified arguments (e.g. from partial evaluation, or in the same call)

Scope Capture

TODO

what variables are available to a function's body

Partial Function Evaluation

First note that if you want to pass a function around as an object, you need to get a handle to the function using the @ ("handle") operator.

If you don't include the @ operator, then the evaluation of the right-hand side would be stored into the left side

What happens is that my_func prints "foo" to the command line, and then, since it returns no value, reference_to_my_func cannot be assigned, causing a compiler error. We'd also get a compiler error if my_func required arguments, as we are essentially trying to call my_func without any arguments.

Now, using the @ operator, we can not only create a new reference to an existing function, but we can also apply arguments to the reference. What this means is we can fix the value of given arguments, allowing us to create a new function.

Here we've created a new function add5 which takes a single argument, and returns the result of that argument plus 5.
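For readers who know Python, `functools.partial` is a close analog of applying arguments to a function handle. The sketch below illustrates the concept only; it is not Dewy's @ syntax, and the two-argument `add` function is assumed for the example.

```python
from functools import partial

def add(a, b):
    return a + b

reference_to_add = add       # a handle: the function itself, not a call
add5 = partial(add, 5)       # fix the first argument, yielding a new function

print(reference_to_add(2, 3))
print(add5(10))
```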

TODO->explain about overwriting arguments.

Operators

Dewy is a 100% expression-based language, meaning everything is formed from small pieces combined together with operators. Dewy has 3 types of operators:

unary prefix: come before the expression
binary infix: come between two expressions
unary postfix: come after the expression

Binary Operators

(TODO: this is missing several operators, and may have some extra ones that are no longer planned)

Basic Math Operations

+ plus
- minus
* multiply
/ divide
// truncated divide
mod modulus
^ exponent

logical and bitwise operations

Note: these are logical if both operands are boolean, otherwise they are bitwise and operate on as many bits as the size of the largest operand

and both are true
or either are true
xor exactly one is true
not invert (unary)
nand either is false
nor both are false
xnor both are false or both are true
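The less common operators in this list are all simple negations of the familiar ones, as the Python sketch below shows. The explicit `bits` parameter in the bitwise variant is an assumption standing in for "the size of the largest operand".

```python
def nand(a, b):
    return not (a and b)

def nor(a, b):
    return not (a or b)

def xnor(a, b):
    return a == b

def bitwise_nand(a, b, bits=8):
    """Bitwise nand on unsigned values of the given width."""
    mask = (1 << bits) - 1
    return ~(a & b) & mask

for a in (False, True):
    for b in (False, True):
        print(a, b, nand(a, b), nor(a, b), xnor(a, b))

print(bin(bitwise_nand(0b1100, 0b1010)))
```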

bit-shift operations

<<! rotate left through carry bit
!>> rotate right through carry bit
<<< rotate left no carry bit
>>> rotate right no carry bit
<< shift left (arithmetic and logical are the same for left-shift)
>> shift right (arithmetic vs logical determined by whether signed or unsigned)
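The difference between shifting and rotating is easiest to see on a fixed width value. Below is a Python sketch for 8-bit unsigned values covering the no-carry rotations and plain shifts; the helper names and the default width are assumptions, and the through-carry variants are omitted.

```python
def rotate_left(value, count, bits=8):
    """Rotate an unsigned value left; bits shifted out re-enter on the right."""
    count %= bits
    mask = (1 << bits) - 1
    return ((value << count) | (value >> (bits - count))) & mask

def rotate_right(value, count, bits=8):
    """Rotate an unsigned value right; bits shifted out re-enter on the left."""
    count %= bits
    mask = (1 << bits) - 1
    return ((value >> count) | (value << (bits - count))) & mask

x = 0b10010001
print(f"{(x << 1) & 0xFF:08b}")       # shift left: the top bit is lost
print(f"{rotate_left(x, 1):08b}")     # rotate left: the top bit wraps around
print(f"{rotate_right(x, 1):08b}")    # rotate right: the low bit wraps around

# Right shift of a signed value keeps the sign (arithmetic shift),
# while an unsigned value is filled with zeros (logical shift).
print(-8 >> 1, 0b11111000 >> 1)
```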

boolean returning operations

=? equal
>? greater than
>=? greater than or equal
<? less than
<=? less than or equal
in? is a member of

colon operator

: apply a type annotation

dictionary pointers

-> indicates the left expression points to the right expression in the dictionary

<-> indicates that left is inserted as a key that points to right, and right is inserted as a key that points to left

Function pointer

=> used for declaring a function literal

handle operator

@ return a handle to the function or variable
@? (probably) check if two references point to the same thing

Assignment operators

= binds the righthand expression to the lefthand identifier as a statement (i.e. nothing is returned)
:= (walrus operator) same as normal assignment operator, but also returns the righthand side as an expression available for use.

Juxtaposition

(TODO: explain how juxtaposition works)

Unary Prefix Operators

(TODO: prefix operators)

Unary Postfix Operators

(TODO: postfix operators)

In-place Assignment

Any of the logical/bitwise operators, as well as the boolean returning operators, can be preceded by not, which will then cause the inverse of the operation to be returned. e.g. not and is equivalent to nand, not <? is equivalent to >=?, etc.

Most binary operators can be appended with an = sign to make them into an assignment e.g.

(TODO: This should also probably be able to be combined with element-wise/vectorized . operations where each element in the list is updated according to the operation (can be done in parallel))

Elementwise Operations

The elementwise operator . can be prepended to most binary operators so that the operation is performed on each element of an array or sequence e.g.

This works if the first operand is a list, the second operand is a list, or both are lists with the exact same shape
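The three cases (list with scalar, scalar with list, and two lists of identical shape) can be sketched in Python as a recursive helper. The function `elementwise` is a hypothetical name, and raising an error on mismatched shapes is an assumption about the intended behavior.

```python
import operator

def elementwise(op, left, right):
    """Apply a binary operator element by element, recursing into nested lists."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if left_is_list and right_is_list:
        if len(left) != len(right):
            raise ValueError("operands must have the same shape")
        return [elementwise(op, a, b) for a, b in zip(left, right)]
    if left_is_list:
        return [elementwise(op, a, right) for a in left]
    if right_is_list:
        return [elementwise(op, left, b) for b in right]
    return op(left, right)

print(elementwise(operator.add, [1, 2, 3], 10))
print(elementwise(operator.mul, 2, [1, 2, 3]))
print(elementwise(operator.add, [[1, 2], [3, 4]], [[10, 20], [30, 40]]))
```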

Precedence

Every operator has a precedence level, and an associativity. The precedence level determines the order in which operators in a compound expression are evaluated. Associativity determines the order of evaluation when multiple operators of the same precedence level are present in a compound expression. Associativity can be:

left-to-right
right-to-left
prefix
postfix
none (typically these expressions generate a single node in the AST)
fail (i.e. the expression is invalid if multiple operators of the same precedence level are present)

(TODO: some way of populating this table with the current full precedence table in the code)

(TODO: this table is missing several operators)

Precedence	Symbol	Name	Associativity
14	`@`	reference	prefix
13	`.` juxtapose juxtapose	access, jux-call, jux-index	left
12	`^`	power	right
11	juxtapose	jux-multiply	left
10	`*` `/` `%`	multiply, divide, modulus	left
9	`+` `-`	add, subtract	left
8	`<<` `>>` `<<<` `>>>` `<<!` `!>>`	left shift, right shift, rotate left no carry, rotate right no carry, rotate left with carry, rotate right with carry	left
#	`in`	in	fail
7	`=?` `>?` `<?` `>=?` `<=?`	equal, greater than, less than, greater than or equal, less than or equal	left
6	`and` `nand` `&`	and, nand, and	left
5	`xor` `xnor`	xor, xnor	left
4	`or` `nor` `|`	or, nor, or	left
3	`comma`	comma	none
#	juxtapose	jux-range	none
2	`=>`	function arrow	right
1	`=`	bind	fail
0	`else`	flow alternate	none
-1	space	space	left
TBD	`as`	as	TBD
TBD	`transmute`	transmute	TBD
TBD	`|>`	pipe	TBD
TBD	`<|`	reverse pipe	TBD
TBD	`->`	right-pointer	TBD
TBD	`<->`	bi-pointer	TBD
TBD	`<-`	left-pointer	TBD
TBD	`:`	type annotation	TBD

multi-operators e.g. 100^/2 for sqrt, or 5+-1, etc. have a precedence at the level of the first operator in the chain (i.e. all following operators have no effect on the precedence). Also any instances of elementwise operators (e.g. .+ .= .xor etc.) are at the same level of precedence as the operator they're attached to. Assignment operators on the other hand are all at the same level, regardless of the type of operator they're attached to (e.g. += <?= <<= nand= etc.)

Numbers and Bases

literal numbers in various bases can be specified using prefixes before the number

Radix	Name	Prefix	Digits
2	Binary	`0b`	`[01]`
3	Ternary	`0t`	`[012]`
4	Quaternary	`0q`	`[0123]`
6	Seximal	`0s`	`[0-5]`
8	Octal	`0o`	`[0-7]`
10	Decimal*	`0d`	`[0-9]`
12	Dozenal	`0z`	`[0-9xXeE]`
16	Hexadecimal	`0x`	`[0-9A-Fa-f]`
32	Duotrigesimal	`0u`	`[0-9A-Va-v]`
36	Hexatrigesimal	`0r`	`[0-9A-Za-z]`
64	Tetrasexagesimal	`0y`	`[0-9A-Za-z!$]`

*Decimal is the default base, so the prefix is generally not necessary, unless the default base is changed.
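The table can be read as a lookup from prefix letter to radix and digit alphabet. The Python sketch below (not Dewy code) shows one way such literals could be decoded. The function name `parse_literal` is hypothetical. It assumes letters are case-insensitive wherever the table lists both cases, and that dozenal `X` and `E` stand for ten and eleven. Base 64 is left out because the table does not say how its digits are ordered.

```python
import string

_ALNUM = string.digits + string.ascii_uppercase  # "0-9A-Z", values 0..35

# prefix letter -> (radix, digit alphabet in ascending value order)
BASES = {
    "b": (2, "01"),
    "t": (3, "012"),
    "q": (4, "0123"),
    "s": (6, "012345"),
    "o": (8, "01234567"),
    "d": (10, string.digits),
    "z": (12, string.digits + "XE"),  # assumption: X = ten, E = eleven
    "x": (16, _ALNUM[:16]),
    "u": (32, _ALNUM[:32]),
    "r": (36, _ALNUM),
}


def parse_literal(text):
    """Decode a prefixed integer literal; unprefixed text is read as decimal."""
    if len(text) > 2 and text[0] == "0" and text[1] in BASES:
        radix, alphabet = BASES[text[1]]
        digits = text[2:]
    else:
        radix, alphabet = BASES["d"]
        digits = text
    value = 0
    for ch in digits.upper():
        # str.index raises ValueError if ch is not a valid digit for this base
        value = value * radix + alphabet.index(ch)
    return value


binary_five = parse_literal("0b101")
hex_255 = parse_literal("0xff")
dozenal_22 = parse_literal("0z1X")
```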

Some examples:

Examples

Basic Math

Linear Algebra

Functional Programming

Meta Programming

Note: This will probably not be relevant until the current handwritten parser is replaced with a parser generator (probably GLL)

Eventually, the goal is for the language to be completely described via some sort of syntax description, such as a context-free grammar. There was work on this in the past, but it was paused in favor of building a usable version of the language first. When there is a suitable parser-generator implementation of the language, one of the planned features is to include the syntax description language within Dewy itself for metaprogramming purposes. Users could describe new syntax features via the metalanguage, and then be able to use them in their programs.

Here's an example of the previous work on the metalanguage:

% This is a description of the metalanguage written in the metalanguage itself

#eps = 'ϵ' | '\\e' | "''" | '""' | "{}";                    % ϵ, \e, '', "", or {} indicates empty element, i.e. nullable
#wschar = [\n\x20];                                         % ascii whitespace characters (restrict to newlines and spaces).
#line_comment = '/\/' '\n'~* / '\n'~;                       % single line comment
#block_string = ξ* - ξ* '}/';                               % inside of a block comment. Cannot end with block comment delimiter
#block_comment = '/\{' (#block_comment | #block_string)* '}/';       % block comment, with allowed nested block comments
#ws = (#wschar | #line_comment | #block_comment)*;          % optional whitespace sequence
#anyset = '\\' [uUxX] | [VUξ];                              % V, U, ξ, \U, \u, \X, or \x used to indicate any unicode character
#hex = '\\' [uUxX] [0-9a-fA-F]+ / [0-9a-fA-F];              % hex number literal. Basically skipping the number part makes it #any
#number = [0-9]+ / [0-9];                                   % decimal number literal. Used to indicate # of repetitions
#charsetchar = ξ - [\-\[\]] - #wschar;                      % characters allowed in a set are any unicode excluding '-', '[', or ']', and whitespace
#item = #charsetchar | #escape | #hex;                      % items that make up character sets, i.e. raw chars, escape chars, or hex chars
#charset = '[' (#ws #item (#ws '-' #ws #item)? #ws)+ ']';   % set of chars specified literally. Whitespace is ignored, and must be escaped.

%paired grouping operators
#group = '(' #ws #expr #ws ')';                             % group together/force precedence
#char = '"' (ξ - '"' | #escape | #hex) '"';                 % "" single character
#char = "'" (ξ - "'" | #escape | #hex) "'";                 % '' single character
#caseless_char = "{" (ξ - [{}] | #escape | #hex) "}";       % {} single character where case is ignored
#string = '"' (ξ - '"' | #escape | #hex)2+ '"';             % "" string of 2+ characters
#string = "'" (ξ - "'" | #escape | #hex)2+ "'";             % '' string of 2+ characters
#caseless_string = "{" (ξ - [{}] | #escape | #hex)2+ "}";   % {} string of 2+ characters where case is ignored for each character
#escape = '\\' ξ;                                           % an escape character. Recognized escaped characters are \n \r \t \v \b \f \a.
                                                            % all others just put the second character literally. Common literals include \\ \' \" \[ \] \-

%post operators
#capture = #expr #ws '.';                                   % group to capture
#star = #expr #ws (#number)? #ws '*';                       % zero or (number or more)
#plus = #expr #ws (#number)? #ws '+';                       % (number or one) or more
#option = #expr #ws '?';                                    % optional
#count = #expr #ws #number;                                 % exactly number of
#compliment = #set #ws '~';                                 % compliment of. equivalent to #any - #set

%implicit operators
#cat = #expr (#ws #expr)+;                                  % concatenate left and right

%binary expr operators
#or = (#expr #ws '|' #ws #expr) - #union;                   % left or right expression
#reject = (#expr #ws '-' #ws #expr) - #diff;                % reduce left expression only if it is not also the right expression
#nofollow = #expr #ws '/' #ws #expr;                        % reduce left expression only if not followed by right expression
#greaterthan = #expr #ws '>' #ws #expr;                     % left expression has higher precedence than right expression

%binary set operators
#diff = #set #ws '-' #ws #set;                              % everything in left that is not in right
#intersect = #set #ws '&' #ws #set;                         % intersect of left and right
#union = #set #ws '|' #ws #set;                             % union of left and right

%syntax constructs
#set = #anyset | #char | #caseless_char | #hex | #charset | #compliment | #diff | #intersect | #union;
#expr = #eps | #set | #group | #capture | #string | #caseless_string | #star | #plus | #option | #count | #cat | #or | #greaterthan | #lessthan | #reject | #nofollow | #hashtag;
#hashtag = '#' [a-zA-Z] [a-zA-Z0-9_]* / [a-zA-Z0-9_];
#rule = #hashtag #ws '=' #ws #expr #ws ';';
#grammar = (#ws #rule)* #ws;
#start = #grammar;

If/when a metalanguage is added to Dewy, it will likely look much different than this. Current drawbacks of this syntax are:

difficulties describing expression precedence and associativity
difficulties handling ambiguity that may arise from the grammar
verbosity of handling whitespace/comments
some incompatibility of the metalanguage with Dewy syntax. Ideally metalanguage expressions would be valid Dewy expressions
currently no process for the semantic results of parsed rules

Something more ideal may make use of string prefix functions

Or perhaps a syntax constructed explicitly from valid Dewy expressions

Flow Control

In Dewy, the two main methods of conditionally executing code are if and loop expressions.

If Expressions

If expressions allow you to conditionally evaluate code based on whether or not some condition is met.

The syntax for an if expression is:

where <condition> must result in a boolean value, and <expression> can be anything. Commonly, <expression> will be a block containing multiple expressions.

Loop Expressions

Loop expressions allow you to repeat the execution of some code while some condition is met.

The syntax for a loop expression is:

where <condition> must be an expression that evaluates to a boolean value, and <expression> can be anything.

Loops will be explored in more detail in One Loop To Rule Them All.

Flow Chains

Multiple flow expressions can be chained together via the else operator, along with an optional final default case that need not be a flow expression. In normal languages, this would be if-else-if kinds of sequences, which are certainly possible in Dewy:

or even

But Dewy also allows loop expressions to be combined in this way as well:

In the above example, if a is greater than b, the first block would be executed, and the rest of the blocks are skipped. If a is less than b, the loop in the second block executes, incrementing a until it is equal to b, at which point the rest of the chain is skipped. Only if a is neither greater than nor less than b (i.e. a equals b) will the final block be executed exclusively.
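The behavior described in this paragraph can be approximated in Python as follows. This is only a model of the semantics, not Dewy syntax, and the function name and branch labels are invented for the example. Python's own `while ... else` does not behave this way (its `else` runs whenever the loop finishes without a `break`), so the sketch tracks explicitly whether the loop body ever ran.

```python
def flow_chain(a, b):
    """Model an if / loop / default chain; returns (branch taken, final a)."""
    if a > b:
        # first link in the chain: a plain `if`
        return "a was greater than b", a
    entered = False
    while a < b:
        # second link: a loop that increments a until it equals b
        entered = True
        a += 1
    if entered:
        # the loop ran at least once, so the rest of the chain is skipped
        return "a was less than b", a
    # final default case: only reached when no earlier link executed
    return "a was equal to b", a


greater = flow_chain(5, 2)
less = flow_chain(2, 5)
equal = flow_chain(3, 3)
```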

(TODO: add an example for how all conditions share the same scope, so variables defined in one condition will be available in later bodies if they execute)

(TODO: probably add a finally operator which can be used to always execute code at the end)

Capturing Values

Unlike if statements from other languages, ifs and loops in Dewy are themselves expressions, allowing any expressed values to be captured. if expressions basically act like Dewy's version of the ternary operator.

my_var would have a value of 'a tropical fruit' at the end of the above example.

Values from loops can be captured to construct sequences, which is explored more in One Loop To Rule Them All.

Match Expressions

(TODO) switch statement equivalent

Break, Continue, Return

(TODO) branch inside body of conditional

Advanced Flow Control

(TODO) combining conditionals

(TODO) list and dictionary generators

One Loop to Rule them All

Other languages often use several keywords such as for, while, do-while, for each, etc., to handle looping over a piece of code. Dewy instead uses only the loop keyword to handle all forms of looping.

Syntactically, loops are quite simple. The syntax for a loop is:

Where <condition>, which must result in a boolean, determines whether the loop continues, and <expression>, which can be anything, is executed each time the loop repeats.

The various types of loops seen in other languages are formed simply by changing the <condition> part of the loop.

Infinite Loops

An infinite loop is one that never ends. They are constructed by hardcoding the condition to true, which ensures that the loop will always repeat.

The only way to leave an infinite loop is via the break and return keywords.

While Loops

A while loop is a loop that executes "while" some condition is true. A simple boolean expression can be used as the condition. When the condition is false, the loop ends.

For Loops

Many languages feature for loops, which iterate over some iterable object. The simplest case of this would be iterating over a range of numbers. In Dewy, the in operator manages iteration for loops. in has two aspects:

the variable on the left is assigned with the next value of the iterable on the right (or void if there are no more values)
the expression returns true or false depending on if there was a value to assign to the variable this iteration.

This means that in expressions can be used to trivially construct a for-loop.

Which prints out 1, 2, 3, 4, 5, to the console. Each iteration, in causes i to be assigned the next value in the sequence, while returning true for the loop condition. When the sequence is exhausted, in returns false, and the loop ends.
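The two aspects of `in` can be modeled in Python with a small object that both stores the current value and reports whether there was one. This is a sketch of the semantics only. The class name `InBinding` and its `advance` method are hypothetical, and `None` stands in for Dewy's "no value".

```python
class InBinding:
    """Models `name in iterable`: assigns the next value and returns a boolean."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self.value = None  # stand-in for "no value"

    def advance(self):
        try:
            self.value = next(self._iterator)  # aspect 1: assign the next value
            return True                        # aspect 2: report success
        except StopIteration:
            self.value = None
            return False


i = InBinding(range(1, 6))
printed = []
while i.advance():  # plays the role of the loop condition
    printed.append(i.value)
```

Because the condition is just a boolean, nothing about the loop itself is specific to iteration. A while-style loop and a for-style loop differ only in the condition they use.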

For loops can also iterate over the items in any type of container. Iterating over a list looks like this

Which prints the following to the console

I like to eat apple
I like to eat banana
I like to eat peach
I like to eat pear

Items in a dictionary can be iterated over like so

This takes advantage of the fact that iterating over a dictionary returns each pair, which can then be unpacked into separate variables show and rating. This prints out the following to the console

I give star wars a 73 out of 100
I give star trek a 89 out of 100
I give star gate a 84 out of 100
I give battlestar galactica a 87 out of 100
I give legend of the galactic heroes a 100 out of 100

Multiple Conditions

A neat side effect of in statements returning a boolean is that it provides a free method for looping over multiple sequences simultaneously. Simply combine two in statements with a logical operator. The loop will continue until one sequence is exhausted, until both are, or until some other condition holds, depending on which logical operator is used. The behavior of zip from other languages can be achieved by combining sequences with and:

In this case, the loop runs until either sequence is exhausted (as and requires both conditions to be true, so as soon as one is false, the loop ends). This prints out the following to the console

Alice chose Red
Bob chose Blue
Charlie chose Green

Other languages commonly have an enumerate function which will count how many iterations have occurred on top of looping over some sequence. This can be achieved by combining an infinite range with any sequence using and:

i will never run out of values, so the loop continues so long as fruit has values remaining. This prints out the following to the console

0) apple
1) banana
2) peach
3) pear

Using the or operator to combine sequences will loop so long as either of the sequences has values remaining.

Which prints

[1 4]
[2 5]
[3 6]
[undefined 7]
[undefined 8]
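The three patterns above (zip, enumerate, and looping while either sequence has values) can be modeled in Python by combining boolean-returning iterator steps. This is an illustrative sketch, not Dewy code: the `InBinding` class is hypothetical, `None` stands in for undefined, and the fourth color is invented so that the sequences have different lengths. The sketch uses the non-short-circuiting `&` and `|` so that both sequences advance on every iteration, which is what the `or` output above implies.

```python
import itertools


class InBinding:
    """Models `name in iterable`: assigns the next value and returns a boolean."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self.value = None

    def advance(self):
        try:
            self.value = next(self._iterator)
            return True
        except StopIteration:
            self.value = None
            return False


# zip: stop as soon as either sequence is exhausted
name = InBinding(["Alice", "Bob", "Charlie"])
color = InBinding(["Red", "Blue", "Green", "Yellow"])
zipped = []
while name.advance() & color.advance():
    zipped.append(f"{name.value} chose {color.value}")

# enumerate: pair an infinite counter with a finite sequence
i = InBinding(itertools.count())
fruit = InBinding(["apple", "banana", "peach", "pear"])
numbered = []
while i.advance() & fruit.advance():
    numbered.append(f"{i.value}) {fruit.value}")

# or: keep going while either sequence still has values
a = InBinding([1, 2, 3])
b = InBinding([4, 5, 6, 7, 8])
rows = []
while a.advance() | b.advance():
    rows.append([a.value, b.value])
```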

Since this is just combining boolean expressions, any combination of expressions that results in a boolean may be used.

As a similar approach, perhaps there might be iterators in this style where they don't have a fixed value, but instead track over some changing resource

Do Loop Do

The do-while version of the loop can be constructed by putting the do keyword before the body, and putting the loop keyword and its condition after the body. This means the loop body is executed at least once before the condition is checked, at which point the loop could exit or continue.

Basic do-while loop:

do-loop over an iterator. On the first iteration, i will be undefined, while it will be available on subsequent iterations

Which prints

this is a do-for loop. i=undefined
this is a do-for loop. i=0
this is a do-for loop. i=1
this is a do-for loop. i=2
this is a do-for loop. i=3
this is a do-for loop. i=4
this is a do-for loop. i=5
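Python has no do-while construct, but the same ordering (body first, condition afterwards) can be sketched with an infinite loop and a check at the end. This is a model of the behavior above, not Dewy syntax. The sentinel `_DONE` and the range `0..5` are choices made for this example so that the output matches the listing.

```python
_DONE = object()  # sentinel marking an exhausted iterator

iterator = iter(range(0, 6))
i = None  # undefined on the first pass through the body
lines = []
while True:
    # the body runs before the condition is checked
    shown = "undefined" if i is None else i
    lines.append(f"this is a do-for loop. i={shown}")
    # the condition: try to bind the next value, stop when there is none
    i = next(iterator, _DONE)
    if i is _DONE:
        break
```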

Technically you can construct an infinite do-while loop, but it's basically identical to a regular infinite loop

Lastly you can sandwich loop between two blocks using two do keywords (one before the first block, and one after the loop condition). This will give you a block executed before the condition and a block executed after the condition

Note: the syntax for do-loop-do is still being finalized, and may change from this example

In this loop, the first block is guaranteed to execute at least once. The condition is then checked; if it is true, the second block executes and the loop repeats, running the first block and checking the condition again. This continues until the condition is false or all elements have been iterated over.

Break, Continue, Return inside Loops

TODO->write this. follows basic principles of other languages. extra is that you can use #hashtags to break/continue from inside nested loops

Loop Generators

Let's look at this example

Every iteration of the loop, the current value of i is "expressed", that is to say, the value could be stored in a variable or a container.

Let's capture the expressed value in a container by wrapping the loop in [] brackets

This "generates" the array [1 2 3 4 5 6 7 8 9 10], which we can then store into a variable

And thus we have created the simplest list generator.

Multiple Expressions per Iteration

Generators can do a lot of interesting things. For example we can express multiple values on a single loop iteration

producing the array [1 1 2 4 3 9 4 16 5 25].

We can also construct a dictionary by expressing with a -> between two values

which produces the dictionary [1->1 2->4 3->9 4->16 5->25] which points from values to their squares.
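For readers coming from Python, the closest analogue to these generators is the comprehension. The snippet below is a comparison only, not Dewy code, and it builds the same three results described above.

```python
# one value expressed per iteration
numbers = [i for i in range(1, 11)]

# two values expressed per iteration: the number, then its square
interleaved = [value for i in range(1, 6) for value in (i, i ** 2)]

# a key/value pair expressed per iteration
squares = {i: i ** 2 for i in range(1, 6)}
```

One difference worth noting: Python needs a separate syntactic form for each of these, whereas in Dewy they are described as falling out of ordinary loops expressing values inside a container.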

Multidimensional Generators

You can generate a multidimensional array using multiple nested loops. For example

which produces the following 3D array representing the indices of a 5x5 matrix as tuples

indices = [
    [[1 1] [1 2] [1 3] [1 4] [1 5]]
    [[2 1] [2 2] [2 3] [2 4] [2 5]]
    [[3 1] [3 2] [3 3] [3 4] [3 5]]
    [[4 1] [4 2] [4 3] [4 4] [4 5]]
    [[5 1] [5 2] [5 3] [5 4] [5 5]]
]
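As a point of comparison (again Python, not Dewy), the same index structure can be built with nested comprehensions, where the outer loop produces rows and the inner loop produces the `[row column]` pairs within each row.

```python
# outer comprehension: one row per i; inner comprehension: one pair per j
indices = [[[i, j] for j in range(1, 6)] for i in range(1, 6)]

third_row = indices[2]
```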

And so many more things are possible. Loop generators are about as flexible a feature as one could imagine. It's really up to you how you want to apply them.

Imports

TODO

Syntax likely to change

Examples of imports

Standard Library

Dewy is 100% batteries included, and provides a comprehensive standard library covering most common programming needs.

Note: This is a work in progress. The standard library is not yet implemented. For now this is just a record of the planned standard library features.

Standard Library Reference

Data Structures

Time

TODO:

timezones
calendars
- gregorian
- julian
- Human Era

Plotting

Work in progress

Types of plots to include out of the box:

ridgeline plot
sankey plots (should be buildable directly from graph network data structure/etc.)
TODO: more

Animating plots

TODO

Blocking vs non-blocking plots

TODO

Parsing

TODO

library methods for parsing given some grammar+ specification (probably base on GLL and/or formalize the existing dewy parsing process into a nice library)
able to generate tree-sitter compatible parsers implemented in C
probably also have easy hooks into language server protocol and common language features
should be easy to do regex like things too (i.e. since parsers are necessarily more powerful than regex, I'm more referring to ease of creating a simple parser should be as easy as making a regex)
- id_matcher = parser'[a-zA-Z_][a-zA-Z0-9_]*'

High Performance Parallelism

TODO: want strong support for high performance parallelism (see discussion with gemini)

CPU work stealing algorithm
user provides functions for: work split, minimum work, result recombining
GPU task support, user provides kernel
complex dependent task support via some sort of dependency graph
high performance distributed support as well. probably completely different set of primitives
note gemini recommended including simpler concurrency tools like mutexes+channels since some simple non-parallel but concurrent problems are not well suited for the more advanced tools like work-stealing deques

Gemini Discussion Outline

Design Document: A Tiered Architecture for High-Performance Parallelism

1. Guiding Principle: Abstracting Complexity via Tiered Abstractions

The core philosophy is to provide programmers with abstractions that match the structure of their problem, while hiding the complex, error-prone mechanics of the underlying hardware. The language will offer a tiered set of tools, from simple "it just works" parallelism for common cases to expert-level control for specialized needs. The default path will always be the safest and most abstract.

2. Tier 1: Foundational CPU Parallelism - The Work-Stealing Scheduler

This is the engine that will power most CPU-bound parallel operations. Its implementation is internal to the language runtime and not directly exposed to the programmer.

Application: General-purpose CPU-bound tasks, fork-join parallelism, parallel loops.
Implementation Strategy:
- Per-Core Deques: On startup, the runtime creates a pool of worker threads, pinning each to a physical CPU core. Each thread is assigned a double-ended queue (deque).
- LIFO/FIFO Discipline:
  - A worker thread pushes new sub-tasks (fork) and pops its own work (join) from the bottom of its deque (LIFO). This is a private, non-atomic, cache-friendly operation.
  - When a worker runs out of local work, it becomes a "thief" and attempts to pop a task from the top of another randomly chosen worker's deque (FIFO).
- Lock-Free Stealing: The "steal" operation on the top of the deque must be implemented using a lock-free algorithm, relying on a hardware-level Compare-And-Swap (CAS) atomic instruction to safely manage the head pointer in the face of concurrent thieves.
- Global Injection Queue: A single, global, concurrent Multi-Producer/Multi-Consumer (MPMC) queue will exist for submitting initial work from outside the thread pool (e.g., from the main thread or an I/O thread). Workers will check this queue before attempting to steal.
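The LIFO/FIFO discipline above can be illustrated with a single-threaded Python simulation. This is a toy model under stated assumptions, not a real scheduler: there are no threads, no CAS, and no global injection queue, and the names `Worker` and `simulate` are made up. It only shows which end of the deque each party takes from.

```python
from collections import deque
import random


class Worker:
    def __init__(self, name, tasks=()):
        self.name = name
        # Right end of the deque is the "bottom" (owner side);
        # left end is the "top" (thief side).
        self.tasks = deque(tasks)

    def pop_local(self):
        """Owner takes its most recently pushed task (LIFO)."""
        return self.tasks.pop() if self.tasks else None

    def steal_from(self, victim):
        """Thief takes the victim's oldest task (FIFO)."""
        return victim.tasks.popleft() if victim.tasks else None


def simulate(workers, rng):
    """Run rounds until no work remains; returns (worker, task, origin) entries."""
    log = []
    while any(w.tasks for w in workers):
        for worker in workers:
            task = worker.pop_local()
            origin = "local"
            if task is None:
                victims = [v for v in workers if v is not worker and v.tasks]
                if not victims:
                    continue
                # pick a random victim, as in the description above
                task = worker.steal_from(rng.choice(victims))
                origin = "stolen"
            log.append((worker.name, task, origin))
    return log


log = simulate([Worker("A", [1, 2, 3, 4]), Worker("B")], random.Random(0))
```

In this run worker A works through its newest tasks while worker B, having nothing of its own, takes A's oldest ones from the other end.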

3. Tier 2: High-Level Parallel Constructs (The Programmer's Interface)

These are the primary tools programmers will use. They are built on top of the Tier 1 scheduler.

3.1. Embarrassingly Parallel Operations

Application: Image processing, scientific computing, data transformation, any problem that can be modeled as "do the same thing to every element in a collection."
Feature: Parallel Iterators

Interface (Pseudocode):

// A standard collection type
interface Collection<T> {
  // Returns a standard sequential iterator
  iter(): Iterator<T>
  // Returns a parallel iterator, the gateway to parallelism
  par_iter(): ParallelIterator<T>
}

interface ParallelIterator<T> {
  // Executes a function for each element in parallel.
  // Blocks until all work is complete.
  for_each(func: (T) -> void): void

  // Creates a new parallel collection by applying a function to each element.
  map<U>(func: (T) -> U): ParallelCollection<U>

  // Reduces the collection to a single value in parallel.
  reduce(identity: T, op: (T, T) -> T): T
}

Implementation Strategy: A call to par_iter().for_each(...) is not a simple loop. The implementation recursively splits the collection's range in half. When a range is large enough, it forks the processing of one half as a new task onto the current worker's deque and recursively processes the other half itself. This naturally feeds the work-stealing scheduler.
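The recursive halving can be sketched in Python as below. This is a simplification: a real work-stealing implementation would fork each half lazily onto the worker's deque, whereas this sketch splits the whole range up front and hands the pieces to a standard thread pool. The names `split` and `par_sum_squares` and the `grain` parameter (the largest piece processed sequentially, which must be at least 1) are assumptions of this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(lo, hi, grain):
    """Recursively halve [lo, hi) until each piece has at most `grain` items."""
    if hi - lo <= grain:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return split(lo, mid, grain) + split(mid, hi, grain)


def par_sum_squares(data, grain=4, workers=4):
    """Sum the squares of `data` by processing the split pieces on a pool."""
    pieces = split(0, len(data), grain)

    def process(piece):
        lo, hi = piece
        return sum(x * x for x in data[lo:hi])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(process, pieces))
    # recombine the per-piece results
    return sum(partial_sums)


pieces = split(0, 10, 4)
total = par_sum_squares(list(range(10)))
```

The three ingredients here (how to split, the minimum piece size, and how to recombine results) are the same three things the notes above say the user would supply.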

3.2. Complex Dependency Graphs

Application: Compilers, build systems, game engines, any workflow with irregular, dynamic dependencies.
Feature: Futures and Asynchronous Tasks

Interface (Pseudocode):

// A Future is a handle to a value that may not be ready yet.
interface Future<T> {
  // Blocks the current *logical task* (not thread) until the value is ready.
  // A worker thread that awaits will drop this task and steal another.
  await(): T

  // Checks if the value is ready without blocking.
  is_ready(): bool
}

// The core scheduler interface for dependent tasks.
interface Scheduler {
  // Schedules a function to run. Returns a Future to its result immediately.
  // The task becomes runnable as soon as its dependencies are met.
  schedule<T>(
    task_func: () -> T,
    dependencies: optional Collection<Future<any>>
  ): Future<T>
}

Implementation Strategy:
1. The Scheduler.schedule function creates a Task object containing the function pointer and a list of dependency Futures. The Task is placed in a central graph and marked Waiting.
2. When a Future is completed, the scheduler iterates through its dependents. If a dependent Task has all its dependencies met, its state is changed to Runnable and it's pushed to the global injection queue.
3. A call to Future.await() within a task's logic is a compiler/runtime intrinsic. It registers the current task as Waiting on the target Future and signals the worker thread to immediately return to the scheduler to find a new Runnable task. It must not block the OS thread.
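Steps 1 and 2 can be sketched with a single-threaded Python model that tracks how many unmet dependencies each task has and moves a task to the runnable queue once that count reaches zero. This is a hedged illustration, not the proposed runtime: tasks are identified by name instead of by Future, dependency results are passed as arguments for brevity, and `await` (step 3) is not modeled. The class and method names are hypothetical.

```python
from collections import deque


class Scheduler:
    def __init__(self):
        self._tasks = {}  # name -> (function, dependency names)

    def schedule(self, name, func, dependencies=()):
        """Register a task; it becomes runnable once its dependencies finish."""
        self._tasks[name] = (func, tuple(dependencies))
        return name

    def run(self):
        """Execute all tasks in dependency order; returns (results, order)."""
        unmet = {name: len(deps) for name, (_, deps) in self._tasks.items()}
        dependents = {name: [] for name in self._tasks}
        for name, (_, deps) in self._tasks.items():
            for dep in deps:
                dependents[dep].append(name)
        runnable = deque(name for name, count in unmet.items() if count == 0)
        results, order = {}, []
        while runnable:
            name = runnable.popleft()
            func, deps = self._tasks[name]
            results[name] = func(*(results[dep] for dep in deps))
            order.append(name)
            # completing a task may unblock tasks that were waiting on it
            for waiting in dependents[name]:
                unmet[waiting] -= 1
                if unmet[waiting] == 0:
                    runnable.append(waiting)
        if len(order) != len(self._tasks):
            raise ValueError("dependency cycle: some tasks never became runnable")
        return results, order


scheduler = Scheduler()
scheduler.schedule("parse", lambda: 2)
scheduler.schedule("lex", lambda: 3)
scheduler.schedule("link", lambda a, b: a + b, dependencies=["parse", "lex"])
results, order = scheduler.run()
```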

4. Tier 3: Heterogeneous & Distributed Computing

This tier acknowledges that not all hardware is the same and that computation may span multiple machines. It builds on the concepts of Tiers 1 and 2 but adapts them for different constraints.

4.1. GPU Co-Processing

Application: Highly data-parallel sub-problems within a larger computation (e.g., specific compiler passes, linear algebra, image filtering).
Feature: Execution Policies and Kernel Abstraction

Interface (Pseudocode):

// Extend parallel iterators with an execution policy.
// The programmer hints at the desired execution environment.
enum ExecutionPolicy { CPU, GPU_ACCELERATED }

interface ParallelIterator<T> {
  // New `for_each` with a policy.
  for_each(policy: ExecutionPolicy, func: (T) -> void): void
}

Implementation Strategy:
- When ExecutionPolicy.GPU_ACCELERATED is used, the runtime attempts to compile the body of the func lambda into a GPU kernel (e.g., SPIR-V or PTX).
- It inserts boilerplate code to manage memory transfers: CPU RAM -> GPU VRAM, kernel execution, and GPU VRAM -> CPU RAM.
- This is a "leaky abstraction": the programmer is responsible for ensuring the func body is GPU-friendly (no divergent control flow, etc.) and that the data size justifies the transfer overhead.
- Indirect Execution: For advanced cases (like game engines), provide IndirectDraw commands that allow the CPU to tell the GPU to chain its own operations without CPU readback, using GPU-side buffers as command arguments.

4.2. Distributed Tasks

Application: Big data processing (e.g., Apache Spark), large-scale scientific simulations.
Feature: Resilient Distributed Datasets/DataFrames and Remote Tasks

Interface (Pseudocode):

// Abstraction for a collection partitioned across a cluster.
interface DistributedCollection<T> {
  // Transformations build a lineage graph but don't execute yet (lazy).
  map<U>(func: (T) -> U): DistributedCollection<U>
  filter(func: (T) -> bool): DistributedCollection<T>

  // Actions trigger the actual computation across the cluster.
  reduce(op: (T, T) -> T): Future<T>
  collect(): Future<Collection<T>>
}

// All functions passed to transformations must be serializable.
// The language must provide a `Serializable` trait/interface.

Implementation Strategy:
1. Lazy Evaluation & Lineage: Calls to map and filter do not execute. They build a logical Directed Acyclic Graph (DAG) of the computation's "lineage".
2. Job Scheduler: A call to an action (reduce, collect) triggers the central scheduler. It analyzes the DAG, optimizes it, and breaks it into stages of coarse-grained, serializable tasks.
3. Data Locality: The scheduler queries a cluster manager to find where data partitions live and attempts to send computation to the data.
4. Fault Tolerance: If a worker node dies, the scheduler uses the lineage graph to re-compute the lost data partitions by re-running the necessary tasks on another available node.
5. RPC & Serialization: The runtime must have a robust, built-in RPC mechanism and a default serialization protocol.
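The lazy evaluation and lineage idea from step 1 can be shown with a single-process Python sketch. It leaves out everything distributed (partitioning, data locality, serialization, RPC), and `LazyCollection` is a hypothetical name. What it does show is that transformations only record what to do, and that an action replays the recorded lineage from the source, which is the same mechanism step 4 relies on for recomputing lost partitions.

```python
class LazyCollection:
    def __init__(self, source, lineage=()):
        self._source = source
        self._lineage = lineage  # tuple of ("map" | "filter", function)

    def map(self, func):
        # Transformation: extend the lineage, run nothing.
        return LazyCollection(self._source, self._lineage + (("map", func),))

    def filter(self, func):
        # Transformation: extend the lineage, run nothing.
        return LazyCollection(self._source, self._lineage + (("filter", func),))

    def collect(self):
        # Action: replay the whole lineage from the original source.
        items = list(self._source)
        for kind, func in self._lineage:
            if kind == "map":
                items = [func(x) for x in items]
            else:
                items = [x for x in items if func(x)]
        return items


calls = []


def times_ten(x):
    calls.append(x)  # record each invocation to make laziness visible
    return x * 10


pipeline = LazyCollection([1, 2, 3, 4]).map(times_ten).filter(lambda x: x > 20)
calls_before_collect = list(calls)  # still empty: nothing has executed yet
collected = pipeline.collect()
```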

5. Conclusion: A Graduated Approach to Parallelism

By structuring the language features in these tiers, we empower all levels of programmers. Novices can achieve significant speedups safely with par_iter. Experts can build complex, real-time systems like game engines using the Future-based tasking system. And data scientists can scale their analyses to massive clusters, all while thinking in terms of the same core concept: a graph of dependent computations. The language's role is to provide the safest, most abstract tool that can effectively solve the problem at hand.

Why keep less performant parallelism/concurrency primitives

That's an exceptionally sharp and provocative design question. It gets right to the heart of language philosophy: Should a language be a "pit of success" by only providing the best tools, or should it be a comprehensive toolbox that includes simpler, sharper tools for specific jobs?

You are advocating for a highly opinionated language design, and there's a strong argument for it. However, removing mutexes entirely would have significant, and likely negative, consequences.

Let's analyze the pros and cons of this "no mutexes, only high-level parallelism" approach.

The Argument FOR Removing Mutexes (Your Position)

Forces Good Habits: By removing the "easy" but often non-performant and deadlock-prone option, the language forces programmers to structure their problems in a way that is amenable to modern parallel patterns (e.g., message passing, structured parallelism). It nudges everyone towards thinking about data ownership and dependency graphs.
Reduces Language Surface Area: Fewer concurrency primitives mean less for the language maintainers to implement, document, and perfect. It also means less for a new programmer to learn and potentially misuse.
Drives API Excellence: As you said, if the only tools are the high-level ones, there is immense pressure to make their APIs incredibly polished, ergonomic, and well-documented, because they must serve everyone from the beginner to the expert.
Avoids Performance Traps: Prevents programmers from building a system that works fine with 2 threads, but then grinds to a halt with 16 threads because of lock contention. They are forced to use a scalable solution from day one.

The Strong Argument AGAINST Removing Mutexes

Despite the compelling points above, here's why virtually every language—even those with state-of-the-art parallelism like Rust and Go—keeps mutexes and similar primitives around.

1. Not All Concurrency is About Parallelism

This is the most crucial distinction.

Parallelism is about doing lots of work faster by using multiple cores. This is where work-stealing shines.
Concurrency is about managing lots of tasks at the same time, which may or may not be CPU-bound.

Consider the Motivating Example: The Global Cache.

In our application, we had a Mutex<Map<...>> for a shared picture cache. Let's analyze this problem:

Is it a performance bottleneck? No. Accessing this cache is infrequent. A thread might grab the lock for a few microseconds once every few seconds. There is virtually zero contention.
Is it a parallelism problem? No. We don't need to speed up the cache access itself. We just need to ensure that when two threads happen to access it at the same time, they do so safely without corrupting the map.

How would you solve this with only high-level tools?

Message Passing / Actor Model: You could create a dedicated "cache actor" thread that owns the map. Any other thread wanting to access the cache would have to send an asynchronous message ({ type: 'get', key: '...' }) and wait for a response message on another channel.

This would work, but it's enormously overwrought for this problem. You've replaced a simple, clear mutex.lock(...) with a dedicated thread, two channels, and an asynchronous message-passing protocol. The cognitive overhead and boilerplate are significantly higher for a problem that a mutex solves perfectly and efficiently.

2. Protecting Inevitable "Oddball" Shared State

Sometimes, you just have a weird, global piece of state that needs to be managed. Initializing a logger, updating a small configuration value, registering a plugin. These are not data-parallel problems. They are simple state-change problems that require synchronization. A mutex is the most direct and readable tool for this job. Forcing it into a complex parallel framework is using a sledgehammer to crack a nut.

3. Interacting with the Outside World (FFI)

When your language needs to call into a C library (Foreign Function Interface), that library might require you to hold a lock before calling certain functions. Your language must provide a primitive that can satisfy this external requirement. You can't tell the C library to adopt a work-stealing model.

4. The "Escape Hatch" is Crucial

High-level abstractions are great, but they aren't perfect for every niche scenario. Sometimes, an expert programmer needs to build a new, custom synchronization primitive or data structure. The fundamental building blocks for those are almost always low-level atomics and, yes, mutexes or condition variables. Removing them takes away the power to build the next generation of high-level abstractions.

Conclusion: The Better Path is "Safe by Default"

Instead of removing mutexes, the modern language design philosophy is to make them safe to use and less appealing for problems they don't solve well.

This is the Rust approach:

Make Mutexes Safe: Mutex<T> is integrated with the ownership system. You cannot forget to unlock it (the lock is released when the guard object goes out of scope). You cannot access the data without holding the lock (this is a compile-time error). This eliminates many common mutex bugs.
Make Parallelism Easy and Fast: Provide incredibly ergonomic and performant tools like Rayon's par_iter().
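
A small sketch of the first point, using only the Rust standard library (Rayon is an external crate and is not shown here). The function `count_words` is a hypothetical example: the map lives inside the `Mutex`, so the only way to reach it is through the guard returned by `lock()`, and the lock is released automatically when that guard goes out of scope.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Counts words from several chunks, one thread per chunk, into one shared map.
fn count_words(chunks: Vec<Vec<&'static str>>) -> HashMap<&'static str, usize> {
    let counts: Arc<Mutex<HashMap<&'static str, usize>>> =
        Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            let counts = Arc::clone(&counts);
            thread::spawn(move || {
                for word in chunk {
                    // The map is only reachable through the guard.
                    let mut guard = counts.lock().unwrap();
                    *guard.entry(word).or_insert(0) += 1;
                    // `guard` is dropped at the end of each iteration,
                    // which unlocks the mutex; there is no unlock() to forget.
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the final map out while briefly holding the lock.
    let result = counts.lock().unwrap().clone();
    result
}

fn main() {
    let counts = count_words(vec![vec!["a", "b"], vec!["a"], vec!["c", "a"]]);
    println!("{:?}", counts.get("a"));
}
```

Locking once per word is fine for a demonstration of safety, but it is exactly the pattern that should not be used for high-performance computation, which is the case the parallel-iterator tools are meant for.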

When a programmer is faced with a choice:

For a big for loop, par_iter() is a one-line change and is clearly the right tool.
For a simple shared cache, wrapping it in a Mutex is straightforward and obviously correct.

The language doesn't forbid the simpler tool; it just makes the high-performance tool so easy and attractive for the right problems that people naturally gravitate towards it. It guides them to the "pit of success" without taking away a fundamental and sometimes necessary tool.

So, your hypothetical language should absolutely not promote mutexes for high-performance computation. But it should provide a safe, well-designed Mutex as a fundamental tool for managing simple shared state, because that is a distinct and valid problem class.

General Concepts

Sandboxes and Harnesses

Say you want to simulate running your program to test it, but without it actually having any side effects outside of the program. Sandboxes/Harnesses make this trivial. (TODO)

When developing apps that have side effects outside of the program (e.g. file system, web traffic, etc.), sandboxes and harnesses provide a convenient way to test the program without it actually affecting anything outside of the program.
Enabling a sandbox/harness should probably be some sort of command line argument when running a program. You can also provide code that determines the sandbox/harness response for various actions (e.g. what happens when the program tries to open a file, send a web request, etc.). There should also be good default implementations for all aspects of the sandbox/harness.
You can also harness portions of an application while other aspects have not yet been developed. For example, if you are developing a new page in a large web app, and that page interacts with parts that aren't implemented yet, you can create a harness that simulates those parts as if they were implemented.
Harnesses can simulate different operating systems, etc.

Sandboxes/Harnesses make it so that the code you test is identical to the code that runs in production.
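
Dewy's own sandbox/harness mechanism is not designed yet, so the following is only a rough sketch of the general idea, written in Rust. It assumes a single kind of side effect (file access); the `FileSystem` trait, `RealFs`, `SandboxFs`, `pick_fs`, and `save_report` are all hypothetical names invented for this illustration.

```rust
use std::collections::HashMap;
use std::io;

// Hypothetical interface covering the one side effect in this sketch.
trait FileSystem {
    fn read(&mut self, path: &str) -> io::Result<String>;
    fn write(&mut self, path: &str, contents: &str) -> io::Result<()>;
}

// Production behavior: really touches the disk.
struct RealFs;

impl FileSystem for RealFs {
    fn read(&mut self, path: &str) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
    fn write(&mut self, path: &str, contents: &str) -> io::Result<()> {
        std::fs::write(path, contents)
    }
}

// Sandbox behavior: files live in memory, nothing outside the program changes.
#[derive(Default)]
struct SandboxFs {
    files: HashMap<String, String>,
}

impl FileSystem for SandboxFs {
    fn read(&mut self, path: &str) -> io::Result<String> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, path.to_string()))
    }
    fn write(&mut self, path: &str, contents: &str) -> io::Result<()> {
        self.files.insert(path.to_string(), contents.to_string());
        Ok(())
    }
}

// The program logic is the same no matter which implementation it is handed.
fn save_report(files: &mut dyn FileSystem, total: u32) -> io::Result<()> {
    files.write("report.txt", &format!("total = {}", total))
}

// A command line argument could drive this choice; here it is a plain flag.
fn pick_fs(sandboxed: bool) -> Box<dyn FileSystem> {
    if sandboxed {
        Box::new(SandboxFs::default())
    } else {
        Box::new(RealFs)
    }
}

fn main() -> io::Result<()> {
    // Run sandboxed so this example has no effect on the real file system.
    let mut files = pick_fs(true);
    save_report(files.as_mut(), 42)?;
    println!("{}", files.read("report.txt")?);
    Ok(())
}
```

The point of the sketch is that `save_report` is unchanged between test and production; only the environment handed to it differs. In Dewy the intent is for this substitution to happen without the program having to thread an interface through its code by hand.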

Advanced Case Studies

This is a collection of real-world use cases demonstrating how the language is used.

Dewy Compiler

The bootstrapped compiler for the language

tokenizing
parsing
type checking
code gen
etc.

Asteroid Detection

Self-contained example for detecting Near Earth Objects (NEOs) and Asteroids

collect raw data from Vera Rubin Observatory
analyze data and identify NEOs/Asteroids
calculate orbital parameters for all identified objects
plot objects in a simple 3D visualization (playing object orbits over time)