•

Rust for Lemmings Reading Club Week 30 [PROJECT]

Welcome to week 30 of Reading Club for Rust’s “The Book” (“The Rust Programming Language”).

“The Reading”

Chapter 19 (continued):
https://rust-book.cs.brown.edu/ch18-03-pattern-syntax.html (the special Brown University version with quizzes etc)

The Twitch Stream

Starting today, within the hour: @sorrybookbroke@sh.itjust.works's twitch stream on this chapter: https://www.twitch.tv/deerfromsmoke

https://www.youtube.com/watch?v=ou2c5J6FmsM&list=PL5HV8OVwY_F9gKodL2S31czb7UCwOAYJL (YouTube Playlist)

Be sure to catch future streams (will/should be weekly: https://www.twitch.tv/deerfromsmoke)

0 comments

Learning Rust and Lemmy

maegul

•

Build your own SQLite (in rust), Part 1: Listing tables

Build your own SQLite, Part 1: Listing tables

https://blog.sylver.dev/build-your-own-sqlite-part-1-listing-tables

As developers, we use databases all the time. But how do they work? In this series, we'll try to answer that question by building our own SQLite-compatible database from scratch. Source code examples will be provided in Rust, but you are encouraged t...

2 comments

Learning Rust and Lemmy

maegul

•

So ... macros are fun!! (a bit of rant, maybe a kinda tutorial, and a quick hack)

Intro

Having read through the macros section of "The Book" (Chapter 19.6), I thought I would try to hack together a simple idea using macros as a way to get a proper feel for them.

The chapter was a little light, and declarative macros (using macro_rules!), which is what I'll be using below, seemed like a potentially very nice feature of the language ... the sort of thing that really makes the language malleable. Indeed, in poking around I've realised, perhaps naively, that macros are a pretty common tool for rust devs (or at least more common than I knew).

I'll rant for a bit first, which those new to rust macros may find interesting or informative (it's kinda a little tutorial) ... to see the implementation, go to "Implementation (without using a macro)" heading and what follows below.

Using a macro

Well, "declarative macros" (with macro_rules!) were pretty useful I found and easy to get going with (such that it makes perfect sense that they're used more frequently than I thought).

It's basically pattern matching on arbitrary code and then emitting new code through a templating-like mechanism (pretty intuitive).
The type system and rust-analyzer LSP understand what you're emitting perfectly well in my experience. It really felt properly native to rust.

The Elements of writing patterns with "Declarative macros"

Use macro_rules! to declare a new macro

Yep, it's also a macro!

Create a structure just like a match expression

Except the pattern will match on the code provided to the new macro
... And uses special syntax for matching on generic parts or fragments of the code
... And it returns new code (not an expression or value).

Write a pattern as just rust code with "generic code fragment" elements

You write the code you're going to match on, but for the parts that you want to capture as they will vary from call to call, you specify variables (or more technically, "metavariables").
- You can think of these as the "arguments" of the macro. As they're the parts that are operated on while the rest is literally just static text/code.
These variables will have a name and a type.
The name is prefixed with a dollar sign $ like so: $GENERIC_CODE.
And its type follows a colon as in ordinary rust: $GENERIC_CODE:expr
- These types are actually syntax specifiers. They specify what part of rust syntax will appear in the fragment.
- Presumably, they link right back into the rust parser and are part of how these macros integrate pretty seamlessly with the type system and borrow checker or compiler.
- Here's a decent list from rust-by-example (you can get a full list in the rust reference on macro "metavariables"):
  - block
  - expr is used for expressions
  - ident is used for variable/function names
  - item
  - literal is used for literal constants
  - pat (pattern)
  - path
  - stmt (statement)
  - tt (token tree)
  - ty (type)
  - vis (visibility qualifier)
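
To make a few of these specifiers concrete, here is a minimal sketch using ident, ty and expr together. The macro name make_fn and the functions it generates are hypothetical, made up for illustration only; the => { ... } part that emits the code is covered further down.

```rust
// Hypothetical macro: declares a zero-argument function called `$name`
// that returns `$ret` and whose body is the expression `$body`.
macro_rules! make_fn {
    ($name:ident, $ret:ty, $body:expr) => {
        fn $name() -> $ret {
            $body
        }
    };
}

// `answer` is matched by `ident`, `i32` by `ty`, `40 + 2` by `expr`.
make_fn!(answer, i32, 40 + 2);
make_fn!(greeting, &'static str, "hello");

fn main() {
    println!("{}", answer());
    println!("{}", greeting());
}
```

Passing something that isn't an identifier where ident is expected (eg a string literal as the first argument) should simply fail to match, which is the syntax enforcement mentioned above.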

So a basic pattern that matches on any struct while capturing the struct's name, its only field's name, and its type would be:

macro_rules! my_new_macro {
    (
        struct $name:ident {
            $field:ident: $field_type:ty
        }
    )
}

Now, $name, $field and $field_type will be captured for any single-field struct (and, presumably, the validity of the syntax enforced by the "fragment specifiers").

Capture any repeated patterns with + or *

Yea, just like regex
Wrap the repeated pattern in $( ... )
Place whatever separating code that will occur between the repeats after the wrapping parentheses:
- EG, a separating comma: $( ... ),
Place the repetition counter/operator after the separator: $( ... ),+

Example

So, to capture multiple fields in a struct (expanding from the example above):

macro_rules! my_new_macro {
    (
        struct $name:ident {
            $field:ident: $field_type:ty,
            $( $ff:ident : $ff_type: ty),*
        }
    )
}

This will capture the first field and then any additional fields.
- The way you use these repeats mirrors the way they're captured: they all get used in the same way and rust will simply repeat the new code for each repeated capture.

Writing the emitted or new code

Use => as with match expressions

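Actually, it's => { ... }, IE with braces (the emitted code has to be wrapped in a delimiter; braces are the convention, though parentheses or square brackets are accepted too)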

Write the new emitted code

All the new code is simply written between the braces
Captured "variables" or "metavariables" can be used just as they were captured: $GENERIC_CODE.
Except types aren't needed here
Captured repeats are expressed within wrapped parentheses just as they were captured: $( ... ),*, including the separator (which can be different from the one used in the capture).
- The code inside the parentheses can differ from that captured (that's the point after all), but at least one of the variables from the captured fragment has to appear in the emitted fragment so that rust knows which set of repeats to use.
- A useful feature here is that the repeats can be used multiple times, in different ways in different parts of the emitted code (the example at the end will demonstrate this).
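
As a small sketch of the last two points (a different separator in the emitted code, and the same repeat used twice), here is a hypothetical macro, names_and_total, that is not part of the units example. It captures comma-separated name = value pairs, then emits the names in one repeat and the values in another.

```rust
// Hypothetical macro: captures `name = value` pairs separated by commas.
macro_rules! names_and_total {
    ( $( $name:ident = $value:expr ),+ ) => {
        (
            // First use of the repeat: only the names, comma separated.
            [ $( stringify!($name) ),+ ],
            // Second use: only the values, with no separator at all.
            // Each repeat emits `+ value`, giving `0 + 1 + 2 + 3`.
            0 $( + $value )+
        )
    };
}

// Wrapper function, only here so the result has a name and a type.
fn demo() -> ([&'static str; 3], i32) {
    names_and_total!(a = 1, b = 2, c = 3)
}

fn main() {
    let (names, total) = demo();
    println!("{:?} {}", names, total);
}
```

Each emitted repeat only mentions one of the two captured variables, which is enough for rust to know which set of repeats to walk through.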

Example

For example, we could convert the struct to an enum where each field becomes a variant with an enclosed value of the same type as the corresponding struct field:

macro_rules! my_new_macro {
    (
        struct $name:ident {
            $field:ident: $field_type:ty,
            $( $ff:ident : $ff_type: ty),*
        }
    ) => {
        enum $name {
            $field($field_type),
            $( $ff($ff_type) ),*
        }
    }
}

With the above macro defined ... this code ...

my_new_macro! {
    struct Test {
        a: i32,
        b: String,
        c: Vec<String>
    }
}

... will emit this code ...

enum Test {
    a(i32),
    b(String),
    c(Vec<String>)
}
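
To check that the emitted enum behaves like a hand-written one, here is a self-contained sketch that repeats the macro and call from above and then matches on the generated variants. The describe function is a hypothetical helper added only for illustration. The compiler will likely warn about the lower-case variant names, since they're taken straight from the field names.

```rust
macro_rules! my_new_macro {
    (
        struct $name:ident {
            $field:ident: $field_type:ty,
            $( $ff:ident : $ff_type: ty),*
        }
    ) => {
        enum $name {
            $field($field_type),
            $( $ff($ff_type) ),*
        }
    }
}

my_new_macro! {
    struct Test {
        a: i32,
        b: String,
        c: Vec<String>
    }
}

// Hypothetical helper: the generated variants can be matched on as usual.
fn describe(t: &Test) -> String {
    match t {
        Test::a(n) => format!("a holds {}", n),
        Test::b(s) => format!("b holds {}", s),
        Test::c(v) => format!("c holds {} strings", v.len()),
    }
}

fn main() {
    println!("{}", describe(&Test::a(7)));
    println!("{}", describe(&Test::b(String::from("hi"))));
    println!("{}", describe(&Test::c(vec![String::from("x")])));
}
```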

Application: "The code" before making it more efficient with a macro

Basically ... a simple system for custom types to represent physical units.

The Concept (and a rant)

A basic pattern I've sometimes implemented on my own (without bothering with dependencies that is) is creating some basic representation of physical units in the type system. Things like meters or centimetres and degrees or radians etc.

If your code relies on such and performs conversions at any point, it is way too easy to fuck up, and therefore worth, IMO, creating some safety around. NASA provides an obvious warning. As does, IMO, common sense and experience: most scientists and physical engineers learn the importance of "dimensional analysis" of their calculations.

In fact, it's the sort of thing that should arguably be built into any language that takes types seriously (like eg rust). I feel like there could be an argument that it'd be as reasonable as the numeric abstractions we've worked into programming??

At the bottom I'll link whatever crates I found for doing a better job of this in rust (one of which seemed particularly interesting).

Implementation (without using a macro)

The essential design is (again, this is basic):

A single type for a particular dimension (eg time or length)
Method(s) for converting between units of that dimension
Ideally, flags or constants of some sort for the units (thinking of enum variants here)
- These could be methods too

#[derive(Debug)]
pub enum TimeUnits {s, ms, us, }

#[derive(Debug)]
pub struct Time {
    pub value: f64,
    pub unit: TimeUnits,
}

impl Time {
    pub fn new<T: Into<f64>>(value: T, unit: TimeUnits) -> Self {
        Self {value: value.into(), unit}
    }

    fn unit_conv_val(unit: &TimeUnits) -> f64 {
        match unit {
            TimeUnits::s => 1.0,
            TimeUnits::ms => 0.001,
            TimeUnits::us => 0.000001,
        }
    }

    fn conversion_factor(&self, unit_b: &TimeUnits) -> f64 {
        Self::unit_conv_val(&self.unit) / Self::unit_conv_val(unit_b)
    }

    pub fn convert(&self, unit: TimeUnits) -> Self {
        Self {
            value: (self.value * self.conversion_factor(&unit)),
            unit
        }
    }
}

So, we've got:

An enum TimeUnits representing the various units of time we'll be using
A struct Time that will be any given value of "time" expressed in any given unit
With methods for converting from any units to any other unit, the heart of which being a match expression on the new unit that hardcodes the conversions (relative to base unit of seconds ... see the conversion_factor() method which generalises the conversion values).

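Note: I'm using T: Into<f64> for the new() method and f64 for Time.value as that is the easiest way I know to accept either integers or floats as values. It works because i32 (along with the smaller integer types and f32) can be converted losslessly to f64. The 64-bit and larger integers (i64, u64 etc, plus isize/usize) don't implement Into<f64>, as that conversion can lose precision.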
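
A minimal sketch of that bound on its own, using a hypothetical helper to_f64 that mirrors the signature of new() above:

```rust
// Hypothetical helper with the same bound as `Time::new`: it accepts any
// type that has a lossless conversion into f64.
fn to_f64<T: Into<f64>>(value: T) -> f64 {
    value.into()
}

fn main() {
    println!("{}", to_f64(3i32));
    println!("{}", to_f64(7u8));
    println!("{}", to_f64(2.5f32));
    println!("{}", to_f64(1.25f64));
    // to_f64(3i64); // does not compile: `From<i64>` is not implemented for f64
}
```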

Obviously you can go further than this. But the essential point is that each dimension (eg time or length) needs to be a new type with all the desired functionality implemented manually or through some handy use of blanket trait implementations.

Defining a macro instead

For something pretty basic, the above is an annoying amount of boilerplate!! May as well rely on a dependency!?

Well, we can write the boilerplate once in a macro and then only provide the informative parts!

In the case of the above, the only parts that matter are:

The name of the type/struct
The name of the units enum type we'll use (as they'll flag units throughout the codebase)
The names of the units we'll use and their value relative to the base unit.

IE, for the above, we only need to write something like:

struct Time {
    value: f64,
    unit: TimeUnits,
    s: 1.0,
    ms: 0.001,
    us: 0.000001
}

Note: this isn't valid rust! But that doesn't matter, so long as we can write a pattern that matches it and emit valid rust from the macro, it's all good! (Which means we can write our own little DSLs with native macros!!)

To capture this, all we need are what we've already done above: capture the first two fields and their types, then capture the remaining "field names" and their values in a repeating pattern.

Implementation of the macro

The pattern

macro_rules! unit_gen {
    (
        struct $name:ident {
            $v:ident: f64,
            $u:ident: $u_enum:ident,
            $( $un:ident : $value:expr ),+
        }
    )
}

Note the repeating fragment doesn't provide a type for the field, but instead captures an expression (expr) after it, despite this being invalid rust.

The Full Macro

macro_rules! unit_gen {
    (
        struct $name:ident {
            $v:ident: f64,
            $u:ident: $u_enum:ident,
            $( $un:ident : $value:expr ),+
        }
    ) => {
        #[derive(Debug)]
        pub struct $name {
            pub $v: f64,
            pub $u: $u_enum,
        }
        impl $name {
            fn unit_conv_val(unit: &$u_enum) -> f64 {
                match unit {
                $(
                    $u_enum::$un => $value
                ),+
                }
            }
            fn conversion_factor(&self, unit_b: &$u_enum) -> f64 {
                Self::unit_conv_val(&self.$u) / Self::unit_conv_val(unit_b)
            }
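            pub fn convert(&self, unit: $u_enum) -> Self {
                Self {
                    // Use the captured field names, so the macro still works
                    // if the fields aren't called `value` and `unit`
                    $v: self.$v * self.conversion_factor(&unit),
                    $u: unit,
                }
            }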
        }
        #[derive(Debug)]
        pub enum $u_enum {
            $( $un ),+
        }
    }
}

Note the repeating capture is used twice here in different ways.

The capture is: $( $un:ident : $value:expr ),+

And in the emitted code:

It is used in the unit_conv_val method as: $( $u_enum::$un => $value ),+
- Here the ident $un is being used as the variant of the enum that is defined later in the emitted code
- Where $u_enum is also used without issue, as the name/type of the enum, despite not being part of the repeated capture but another variable captured outside of the repeated fragments.
It is then used in the definition of the variants of the enum: $( $un ),+
- Here, only one of the captured variables is used, which is perfectly fine.

Usage

Now all of the boilerplate above is unnecessary, and we can just write:

unit_gen!{
    struct Time {
        value: f64,
        unit: TimeUnits,
        s: 1.0,
        ms: 0.001,
        us: 0.000001
    }
}

Usage from main.rs:

use units::Time;
use units::TimeUnits::{s, ms, us};

fn main() {

    let x = Time{value: 1.0, unit: s};
    let y = x.convert(us);

    println!("{:?}", x);
    println!("{:?}", y);
}

Output:

Time { value: 1.0, unit: s }
Time { value: 1000000.0, unit: us }

Note how the struct and enum created by the emitted code are properly available from the module as though they were written manually or directly.
In fact, my LSP (rust-analyzer) was able to autocomplete these immediately once the macro was written and called.
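
For anyone wanting to try this without setting up a separate units module, here is a single-file sketch of the same macro and call. It differs from the above only in assumptions of mine: everything lives in one file (so no use statements), and convert is written in terms of the captured field names $v and $u. The compiler will likely warn about the lower-case variant names.

```rust
macro_rules! unit_gen {
    (
        struct $name:ident {
            $v:ident: f64,
            $u:ident: $u_enum:ident,
            $( $un:ident : $value:expr ),+
        }
    ) => {
        #[derive(Debug)]
        pub struct $name {
            pub $v: f64,
            pub $u: $u_enum,
        }
        impl $name {
            // Value of each unit relative to the base unit
            fn unit_conv_val(unit: &$u_enum) -> f64 {
                match unit {
                    $( $u_enum::$un => $value ),+
                }
            }
            fn conversion_factor(&self, unit_b: &$u_enum) -> f64 {
                Self::unit_conv_val(&self.$u) / Self::unit_conv_val(unit_b)
            }
            pub fn convert(&self, unit: $u_enum) -> Self {
                Self {
                    $v: self.$v * self.conversion_factor(&unit),
                    $u: unit,
                }
            }
        }
        #[derive(Debug)]
        pub enum $u_enum {
            $( $un ),+
        }
    }
}

unit_gen! {
    struct Time {
        value: f64,
        unit: TimeUnits,
        s: 1.0,
        ms: 0.001,
        us: 0.000001
    }
}

fn main() {
    let x = Time { value: 2.0, unit: TimeUnits::s };
    let y = x.convert(TimeUnits::ms);
    println!("{:?}", x);
    println!("{:?}", y);
}
```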

Crates for unit systems

I did a brief search for actual units systems and found the following

`dimensioned`

dimensioned documentation

Easily the most interesting to me (from my quick glance), as it seems to have created the most native and complete representation of physical units in the type system
It creates, through types, a 7-dimensional space, with one dimension for each SI base unit
This allows all possible units to be represented as a reduction to a point in this space.
- EG, if the dimensions are [seconds, meters, kgs, amperes, kelvins, moles, candelas], then the Newton, m.kg / s^2 would be [-2, 1, 1, 0, 0, 0, 0].
This allows all units to be mapped directly to this consistent representation (interesting!!), and all operations to then be done easily and systematically.
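
To illustrate the idea only (this is not the dimensioned API, which does this at the type level; the names Dim, mul, div and newton below are all hypothetical), the exponent-vector representation can be sketched with plain arrays. Multiplying two units adds their exponents, dividing subtracts them.

```rust
// Exponents in the order assumed in the text:
// [seconds, meters, kgs, amperes, kelvins, moles, candelas]
type Dim = [i8; 7];

const SECOND: Dim = [1, 0, 0, 0, 0, 0, 0];
const METER: Dim = [0, 1, 0, 0, 0, 0, 0];
const KILOGRAM: Dim = [0, 0, 1, 0, 0, 0, 0];

// Multiplying units adds the exponents of each base unit.
fn mul(a: Dim, b: Dim) -> Dim {
    let mut out = [0i8; 7];
    for i in 0..7 {
        out[i] = a[i] + b[i];
    }
    out
}

// Dividing units subtracts the exponents of each base unit.
fn div(a: Dim, b: Dim) -> Dim {
    let mut out = [0i8; 7];
    for i in 0..7 {
        out[i] = a[i] - b[i];
    }
    out
}

// The Newton: m.kg / s^2
fn newton() -> Dim {
    div(mul(METER, KILOGRAM), mul(SECOND, SECOND))
}

fn main() {
    println!("{:?}", newton());
}
```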

Unfortunately, I'm not sure if the repository is still maintained.

uom

uom documentation

This might actually be good too, I just haven't looked into it much
It also seems to be currently maintained

F#

Interestingly, F# actually has a system built in!

See learning documentation on F# here
Also this older (2008) series of blogs on the feature here

14 comments

Learning Rust and Lemmy

maegul

•

Hot takes on the state of Rust v C/C++ for safety (mastodon cross post)

Björkus "No time_t to Die" Dorkus (@thephd@pony.social)

https://pony.social/@thephd/112818744298401332

A lot of people think I'm being sarcastic here, which is fair because I only went toe-to-toe against people on Twitter and didn't do much here, so I'll state my full opinion below anyhow:

I would agree with anyone about not wanting to replace C (or C++). But, C has been alive for 50 years (or just 35 from C89) and Rust has been alive for just barely under 10 (since Rust 1.0). Even if you measure the last 10 years of Rust versus the last 10 years of C or C++, one of these languages is making leaps and bounds ahead in providing people better primitives to do good work.

SafeInt secured pretty much all of Microsoft Office from some of the hardest bugs back in, around, 2005. C++ still lacks safe integer primitives; C only just got 3 functions to do overflow-checked math in C23, after David Svoboda campaigned for years. Rust just... has them baked into the standard library, for all the types you care about, too.

Similarly, people have been having memory issues in C and C++ for a while too. Most of the way to get better has been clamping down on static analysis and doing more testing, but we're still getting these errors. Meanwhile, teams writing Rust have been making way less errors on this in all the openly-published data from corporations like Google, and privately we are hearing a lot more about people taking complex financial and parsing code and turning it into Rust and having a fraction of the issues.

Even if I want to see C doing better, I have to acknowledge we were (a) too slow and not brave enough to do the things that could fix these portions of the language; (b) have fundamental design issues in the language itself that make ownership impossible to integrate as part of the language without breaking a ton of code; (c) do not provide good in-language tools and keep depending on vendors to "do the right thing" (i.e. adding or expanding U.B. and then just saying "vendors will check it" rather than taking responsibility with our language design); (d) are moving monumentally too slow to address the needs of the industry that many people -- especially security people -- have been yelling about since the mid 90s.

As much as I just want to pretend that I can write off every developer with "haha lole skill issue test better sanitize better IDIOT", if the root cause on this bug is "there was some C and/or C++ code that looked nominally correct but did batshit insanity in production", we absolutely will have problems to answer for.

This doesn't absolve CrowdStrike for cutting 100s of workers and playing fast and loose, this doesn't excuse the fact that hospitals went down and people likely dead from lack of access to care, this doesn't change that it's abhorrent to have unmitigated hardware access in Ring0 just for a "security product", which has been the trend of every app wanting to plug in its own RootKit-like tool just for the sake of "app security" lately (League, NProtect, School Exam Spyware, etc.).

There's a LOT of levels of "what the fuck have we let happen?" in play here, but I don't control those other levels. I'm responsible for C, so I'm gonna look at the C bit. Other people responsible for the other parts of this stack should, hopefully, take sincere responsibility for those parts. (I doubt it, though, lmao.)

8 comments

Learning Rust and Lemmy

maegul

•

Learn Rust the Dangerous Way - Cliffle

https://cliffle.com/p/dangerust/

0 comments

Learning Rust and Lemmy

maegul

•

Making a `collect_vec` method/trait?

Just a quick riff/hack on whether it'd be hard to make a collect() method that "collected" into a Vec without needing any turbofish (see, if you're interested, my prior post on the turbofish).

Some grasp of traits and iteration is required to comfortably get this ... though it might be a fun dive even if you're not

Background on `collect`

The implementation of collect is:

fn collect<B: FromIterator<Self::Item>>(self) -> B
where
    Self: Sized,
{
    FromIterator::from_iter(self)
}

The generic type B is bound by FromIterator which basically enables a type to be constructed from an Iterator. In other words, collect() returns any type that can be built from an iterator. EG, Vec.

The reason the turbofish comes about is that, as I said above, it returns "any type" that can be built from an iterator. So when we run something like:

let z = [1i32, 2, 3].into_iter().collect();

... we have a problem ... rust, or the collect() method has no idea what type we're building/constructing.

More specifically, looking at the code for collect, in the call of FromIterator::from_iter(self), which is calling the method on the trait directly, rust has no way to determine which implementation of the trait to use. The one on Vec or HashMap or String etc??

Thus, the turbofish syntax specifies the generic type B, which is also the return type, and that in turn determines which FromIterator implementation gets used.

let z = [1i32, 2, 3].into_iter().collect::<Vec<_>>();

IE: Use the implementation on Vec!
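
The turbofish isn't the only way to pin down B. As a sketch (the three function names below are hypothetical, used only to label each approach), a type annotation on the binding or a function's return type give the compiler the same information:

```rust
use std::collections::HashSet;

// 1. Turbofish: B is given explicitly on the call.
fn with_turbofish(v: Vec<i32>) -> Vec<i32> {
    v.into_iter().collect::<Vec<_>>()
}

// 2. Annotation on the binding: B is inferred from the type of `set`.
fn with_annotation(v: Vec<i32>) -> HashSet<i32> {
    let set: HashSet<i32> = v.into_iter().collect();
    set
}

// 3. Return type: B is inferred from the function signature.
fn with_return_type(v: Vec<char>) -> String {
    v.into_iter().collect()
}

fn main() {
    println!("{:?}", with_turbofish(vec![1, 2, 3]));
    println!("{}", with_annotation(vec![1, 1, 2]).len());
    println!("{}", with_return_type(vec!['o', 'k']));
}
```

In each case a different FromIterator implementation is picked (the one on Vec, HashSet or String) purely from the type that the result has to have.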

Why not just use `Vec`?

I figure Vec is used so often as the type for collecting an Iterator that it could be nice to have a convenient method.

The docs even hint at this by suggesting that calling the FromIterator::from_iter() method directly from the desired type (eg Vec) can be more readable (see FromIterator docs).

EG ... using collect:

let d = [1i32, 2, 3];
let x = d.iter().map(|x| x + 100).collect::<Vec<_>>();

Using Vec::from_iter()

let y = Vec::from_iter(d.iter().map(|x| x + 100));

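As Vec is always in the prelude (IE, it's always available), using from_iter clearly seems like a nicer option here (though note that the FromIterator trait itself is only in the prelude from the 2021 edition; on earlier editions it needs a use std::iter::FromIterator).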

But you lose method chaining! So ... how about a method on Iterator, like collect but for Vec specifically? How would you make that and is it hard??

Making `collect_vec()`

It's not hard actually

Define a trait, CollectVec that defines a method collect_vec which returns Vec<Self::Item>
Make this a "sub-trait" of Iterator (or, make Iterator the "supertrait") so that the Iterator::collect() method is always available
Implement CollectVec for all types that implement Iterator by just calling self.collect() ... the type inference will take care of the rest, because it's clear that a Vec will be used.

trait CollectVec: Iterator {
    fn collect_vec(self) -> Vec<Self::Item>;
}

impl<I: Iterator> CollectVec for I {
    fn collect_vec(self) -> Vec<Self::Item> {
        self.collect()
    }
}

With this you can then do the following:

let d = [1i32, 2, 3];
let d2 = d.iter().map(|x| x + 1).collect_vec();

Don't know about you, but implementing such methods for the common collection types would suit me just fine ... that turbofish is a pain to write ... and AFAICT this isn't inconsistent with rust's style/design. And it's super easy to implement ... the type system handles this issue very well.
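
As a sketch of what "such methods for the common collection types" might look like, here is the trait from above in runnable form alongside a hypothetical sibling, CollectSet, for HashSet. The second trait is my own illustration following the same recipe; the only difference is the extra bound the collection needs on the item type.

```rust
use std::collections::HashSet;
use std::hash::Hash;

trait CollectVec: Iterator {
    fn collect_vec(self) -> Vec<Self::Item>;
}

impl<I: Iterator> CollectVec for I {
    fn collect_vec(self) -> Vec<Self::Item> {
        self.collect()
    }
}

// Hypothetical sibling trait: same recipe, but HashSet needs its
// items to be hashable and comparable.
trait CollectSet: Iterator {
    fn collect_set(self) -> HashSet<Self::Item>
    where
        Self::Item: Eq + Hash;
}

impl<I: Iterator> CollectSet for I {
    fn collect_set(self) -> HashSet<Self::Item>
    where
        Self::Item: Eq + Hash,
    {
        self.collect()
    }
}

fn main() {
    let d = [1i32, 2, 2, 3];
    let v = d.iter().map(|x| x + 1).collect_vec();
    let s = d.iter().copied().collect_set();
    println!("{:?} {}", v, s.len());
}
```

As with any extension trait, the trait has to be in scope wherever the method is called.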

2 comments

Learning Rust and Lemmy

maegul

•

`std::f32::consts` - Rust (module for mathematical constants)

std::f32::consts - Rust

https://doc.rust-lang.org/stable/std/f32/consts/index.html

Basic mathematical constants.

0 comments

Learning Rust and Lemmy

maegul

•

How to think about function trait bounds (specifically that for `thread::spawn()`)?

spawn in std::thread - Rust

https://doc.rust-lang.org/std/thread/fn.spawn.html

Spawns a new thread, returning a `JoinHandle` for it.

4 comments

Learning Rust and Lemmy

sorrybookbroke

•

Rust for lemmings reading club stream canceled this week.

I'll be clear, quite embarrassingly I bit my tongue hard last night and haven't been talking right all day. Hurts to talk, hurts to eat, and worst of all hot tea is undrinkable. How will I live. Now I know exactly what it feels like to be a soldier wounded in combat.

Will resume next week in full force. In the meantime however please feel free to read ahead. Or, alternatively, try out a few leetcode/advent of code questions. This is what I'll be doing tonight.

4 comments

Learning Rust and Lemmy

maegul

•

`::<>` ... the story of the turbofish in rust

::<>

https://turbo.fish/

2 comments

Learning Rust and Lemmy
