!learningrustandlemmy@lemmy.ml
A collaborative space for people to work together on learning Rust, learning about the Lemmy code base, discussing whatever confusions or difficulties we're having in these endeavours, and solving problems, including, hopefully, some contributions back to the Lemmy code base.
Rules TL;DR: Be nice, constructive, and focus on learning and working together on understanding Rust and Lemmy.
Thumbnail and banner generated by ChatGPT.
Welcome to week 30 of Reading Club for Rust’s “The Book” (“The Rust Programming Language”).
“The Reading”
Chapter 19 (continued):
https://rust-book.cs.brown.edu/ch18-03-pattern-syntax.html
(the special Brown University version with quizzes etc)
The Twitch Stream
Starting today within the hour: @sorrybookbroke@sh.itjust.works's Twitch stream on this chapter: https://www.twitch.tv/deerfromsmoke
https://www.youtube.com/watch?v=ou2c5J6FmsM&list=PL5HV8OVwY_F9gKodL2S31czb7UCwOAYJL (YouTube Playlist)
Be sure to catch future streams (will/should be weekly: https://www.twitch.tv/deerfromsmoke)
https://blog.sylver.dev/build-your-own-sqlite-part-1-listing-tables
As developers, we use databases all the time. But how do they work? In this series, we'll try to answer that question by building our own SQLite-compatible database from scratch. Source code examples will be provided in Rust, but you are encouraged t...
Having read through the macros section of "The Book" (Chapter 19.6), I thought I would try to hack together a simple idea using macros as a way to get a proper feel for them.
The chapter was a little light, and declarative macros (using macro_rules!), which is what I'll be using below, seemed like a potentially very nice feature of the language ... the sort of thing that really makes the language malleable. Indeed, in poking around I've realised, perhaps naively, that macros are a pretty common tool for rust devs (or at least more common than I knew).
I'll rant for a bit first, which those new to rust macros may find interesting or informative (it's kinda a little tutorial) ... to see the implementation, go to "Implementation (without using a macro)" heading and what follows below.
Well, "declarative macros" (with macro_rules!) were pretty useful I found and easy to get going with (such that it makes perfect sense that they're used more frequently than I thought).
The tooling (rust-analyzer, the LSP) understands what you're emitting perfectly well in my experience. It really felt properly native to rust.
Use macro_rules! to declare a new macro
Yep, it's also a macro!
Create a structure just like a match expression
Write a pattern as just rust code with "generic code fragment" elements
Name each fragment with a leading $ like so: $GENERIC_CODE.
Give each fragment a "fragment specifier" after a colon: $GENERIC_CODE:expr
Fragment specifiers include:
block
expr is used for expressions
ident is used for variable/function names
item
literal is used for literal constants
pat (pattern)
path
stmt (statement)
tt (token tree)
ty (type)
vis (visibility qualifier)
So a basic pattern that matches on any single-field struct while capturing the struct's name, its only field's name, and its type would be:
macro_rules! my_new_macro {
(
struct $name:ident {
$field:ident: $field_type:ty
}
)
}
Now, $name, $field and $field_type will be captured for any single-field struct (and, presumably, the validity of the syntax enforced by the "fragment specifiers").
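To see the three captures actually doing something, here is a minimal sketch that completes the pattern with an emitted body. The macro name describe_struct, the describe() method and the Wrapper struct are made up for illustration and are not from the post.

```rust
// Hypothetical macro: matches a single-field struct, re-emits it,
// and adds a method that reports what was captured.
macro_rules! describe_struct {
    (
        struct $name:ident {
            $field:ident: $field_type:ty
        }
    ) => {
        // Re-emit the struct unchanged.
        struct $name {
            $field: $field_type,
        }

        impl $name {
            // Turn the captured fragments into text with stringify!.
            fn describe() -> String {
                format!(
                    "{} {{ {}: {} }}",
                    stringify!($name),
                    stringify!($field),
                    stringify!($field_type)
                )
            }
        }
    };
}

describe_struct! {
    struct Wrapper {
        inner: i32
    }
}

fn main() {
    let w = Wrapper { inner: 5 };
    println!("{} holds {}", Wrapper::describe(), w.inner);
}
```

Passing something that is not an identifier where $name is expected (say, a number) is rejected at compile time, which seems to be the fragment specifiers doing their job.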
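Capture any repeated patterns with + or *
These work as in regex: + is one or more, * is zero or more
Wrap the repeated pattern in $( ... )
Put any separator, such as a comma, straight after the parentheses: $( ... ),
Then add the repetition operator: $( ... ),+
So, to capture multiple fields in a struct (expanding from the example above):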
macro_rules! my_new_macro {
(
struct $name:ident {
$field:ident: $field_type:ty,
$( $ff:ident : $ff_type: ty),*
}
)
}
Use => as with match expressions
The emitted code goes in a delimited block: => { ... }, IE with braces (not sure why)
Write the new emitted code
Use the captured fragments by name: $GENERIC_CODE.
Emit repeated captures with $( ... ),*, including the separator (which can be different from the one used in the capture).
For example, we could convert the struct to an enum where each field became a variant with an enclosed value of the same type as the corresponding struct field:
macro_rules! my_new_macro {
(
struct $name:ident {
$field:ident: $field_type:ty,
$( $ff:ident : $ff_type: ty),*
}
) => {
enum $name {
$field($field_type),
$( $ff($ff_type) ),*
}
}
}
With the above macro defined ... this code ...
my_new_macro! {
struct Test {
a: i32,
b: String,
c: Vec<String>
}
}
... will emit this code ...
enum Test {
a(i32),
b(String),
c(Vec<String>)
}
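As a possible refinement (a sketch, not the post's macro), the first field can be folded into the repetition, and a trailing comma accepted with $(,)?. The name struct_to_enum is hypothetical; the derive and allow attributes are only there so the result can be printed without lint noise.

```rust
// Hypothetical variant of the macro above: one repetition covers every field,
// and `$(,)?` lets the caller leave a trailing comma after the last one.
macro_rules! struct_to_enum {
    (
        struct $name:ident {
            $( $field:ident : $field_type:ty ),+ $(,)?
        }
    ) => {
        #[allow(non_camel_case_types)] // variants reuse the lowercase field names
        #[derive(Debug, PartialEq)]
        enum $name {
            $( $field($field_type) ),+
        }
    };
}

struct_to_enum! {
    struct Test {
        a: i32,
        b: String,
        c: Vec<String>,
    }
}

fn main() {
    // Each former field is now a variant wrapping a value of the field's type.
    let values = vec![Test::a(1), Test::b("two".to_string()), Test::c(vec![])];
    println!("{:?}", values);
}
```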
Basically ... a simple system for custom types to represent physical units.
A basic pattern I've sometimes implemented on my own (without bothering with dependencies that is) is creating some basic representation of physical units in the type system. Things like meters or centimetres and degrees or radians etc.
If your code relies on such and performs conversions at any point, it is way too easy to fuck up, and therefore worth, IMO, creating some safety around. NASA provides an obvious warning. As does, IMO, common sense and experience: most scientists and physical engineers learn the importance of "dimensional analysis" of their calculations.
In fact, it's the sort of thing that should arguably be built into any language that takes types seriously (like eg rust). I feel like there could be an argument that it'd be as reasonable as the numeric abstractions we've worked into programming??
At the bottom I'll link whatever crates I found for doing a better job of this in rust (one of which seemed particularly interesting).
The essential design is (again, this is basic):
#[derive(Debug)]
pub enum TimeUnits {s, ms, us, }
#[derive(Debug)]
pub struct Time {
pub value: f64,
pub unit: TimeUnits,
}
impl Time {
pub fn new<T: Into<f64>>(value: T, unit: TimeUnits) -> Self {
Self {value: value.into(), unit}
}
fn unit_conv_val(unit: &TimeUnits) -> f64 {
match unit {
TimeUnits::s => 1.0,
TimeUnits::ms => 0.001,
TimeUnits::us => 0.000001,
}
}
fn conversion_factor(&self, unit_b: &TimeUnits) -> f64 {
Self::unit_conv_val(&self.unit) / Self::unit_conv_val(unit_b)
}
pub fn convert(&self, unit: TimeUnits) -> Self {
Self {
value: (self.value * self.conversion_factor(&unit)),
unit
}
}
}
So, we've got:
An enum TimeUnits representing the various units of time we'll be using
A struct Time that will be any given value of "time" expressed in any given unit
A match expression on the unit (in unit_conv_val()) that hardcodes the conversions relative to the base unit of seconds ... see the conversion_factor() method, which generalises the conversion values.
Note: I'm using T: Into<f64> for the new() method and f64 for Time.value as that is the easiest way I know to accept either integers or floats as values. It works because i32 (and the other numeric types of 32 bits or fewer, though not i64 or u64) can be converted losslessly to f64.
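A short usage sketch of the design above, assuming the same Time and TimeUnits definitions (condensed here so the block stands alone; conversion_factor() is inlined into convert(), and the PartialEq/Clone/Copy derives are additions for the example).

```rust
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TimeUnits { s, ms, us }

#[derive(Debug)]
pub struct Time {
    pub value: f64,
    pub unit: TimeUnits,
}

impl Time {
    // Accepts anything with a lossless conversion to f64 (i32, u8, f32, ...).
    pub fn new<T: Into<f64>>(value: T, unit: TimeUnits) -> Self {
        Self { value: value.into(), unit }
    }

    // Size of each unit relative to the base unit (seconds).
    fn unit_conv_val(unit: &TimeUnits) -> f64 {
        match unit {
            TimeUnits::s => 1.0,
            TimeUnits::ms => 0.001,
            TimeUnits::us => 0.000001,
        }
    }

    pub fn convert(&self, unit: TimeUnits) -> Self {
        let factor = Self::unit_conv_val(&self.unit) / Self::unit_conv_val(&unit);
        Self { value: self.value * factor, unit }
    }
}

fn main() {
    let from_int = Time::new(2i32, TimeUnits::s);
    let from_float = Time::new(1.5f32, TimeUnits::ms);
    // Time::new(2i64, TimeUnits::s) would not compile:
    // there is no `From<i64> for f64`, because that conversion can lose precision.
    println!("{:?}", from_int.convert(TimeUnits::ms));
    println!("{:?}", from_float.convert(TimeUnits::us));
}
```

Because the factors are floating point, converted values are best compared with a tolerance rather than with exact equality.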
Obviously you can go further than this. But the essential point is that each unit needs to be a new type with all the desired functionality implemented manually or through some handy use of blanket trait implementations.
For something pretty basic, the above is an annoying amount of boilerplate!! May as well rely on a dependency!?
Well, we can write the boilerplate once in a macro and then only provide the informative parts!
In the case of the above, the only parts that matter are:
The name of the struct
The name of the enum type we'll use (as they'll flag units throughout the codebase)
The units themselves and their conversion values
IE, for the above, we only need to write something like:
struct Time {
value: f64,
unit: TimeUnits,
s: 1.0,
ms: 0.001,
us: 0.000001
}
Note: this isn't valid rust! But that doesn't matter, so long as we can write a pattern that matches it and emit valid rust from the macro, it's all good! (Which means we can write our own little DSLs with native macros!!)
To capture this, all we need are what we've already done above: capture the first two fields and their types, then capture the remaining "field names" and their values in a repeating pattern.
The pattern
macro_rules! unit_gen {
(
struct $name:ident {
$v:ident: f64,
$u:ident: $u_enum:ident,
$( $un:ident : $value:expr ),+
}
)
}
Note that each unit is captured as an ident with a colon and an expr after it, despite being invalid rust.
The Full Macro
macro_rules! unit_gen {
(
struct $name:ident {
$v:ident: f64,
$u:ident: $u_enum:ident,
$( $un:ident : $value:expr ),+
}
) => {
#[derive(Debug)]
pub struct $name {
pub $v: f64,
pub $u: $u_enum,
}
impl $name {
fn unit_conv_val(unit: &$u_enum) -> f64 {
match unit {
$(
$u_enum::$un => $value
),+
}
}
fn conversion_factor(&self, unit_b: &$u_enum) -> f64 {
Self::unit_conv_val(&self.$u) / Self::unit_conv_val(unit_b)
}
pub fn convert(&self, unit: $u_enum) -> Self {
Self {
// use the captured field names so the macro works whatever the fields are called
$v: (self.$v * self.conversion_factor(&unit)),
$u: unit
}
}
}
#[derive(Debug)]
pub enum $u_enum {
$( $un ),+
}
}
}
Note the repeating capture is used twice here in different ways.
The capture itself is: $( $un:ident : $value:expr ),+
And in the emitted code:
In the unit_conv_val method as: $( $u_enum::$un => $value ),+
Here the ident $un is being used as the variant of the enum that is defined later in the emitted code
$u_enum is also used without issue, as the name/type of the enum, despite not being part of the repeated capture but another variable captured outside of the repeated fragments.
In the enum definition as: $( $un ),+
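The same trick in a smaller setting, as a rough sketch: one repeated capture emitted twice, once as enum variants and once as match arms. The macro name named_enum, the label() method and the Unit enum are hypothetical.

```rust
// Hypothetical macro: `$variant` is captured once and emitted in two places.
macro_rules! named_enum {
    (
        enum $name:ident { $( $variant:ident ),+ }
    ) => {
        #[derive(Debug, Clone, Copy)]
        enum $name {
            // First use: the variants themselves.
            $( $variant ),+
        }

        impl $name {
            fn label(&self) -> &'static str {
                match self {
                    // Second use: one match arm per variant.
                    // `$name` was captured outside the repetition but works inside it.
                    $( $name::$variant => stringify!($variant) ),+
                }
            }
        }
    };
}

named_enum! {
    enum Unit { Seconds, Millis, Micros }
}

fn main() {
    for unit in [Unit::Seconds, Unit::Millis, Unit::Micros] {
        println!("{:?} -> {}", unit, unit.label());
    }
}
```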
Now all of the boilerplate above is unnecessary, and we can just write:
unit_gen!{
struct Time {
value: f64,
unit: TimeUnits,
s: 1.0,
ms: 0.001,
us: 0.000001
}
}
Usage from main.rs:
use units::Time;
use units::TimeUnits::{s, ms, us};
fn main() {
let x = Time{value: 1.0, unit: s};
let y = x.convert(us);
println!("{:?}", x);
println!("{:?}", y);
}
Output:
Time { value: 1.0, unit: s }
Time { value: 1000000.0, unit: us }
The struct and enum created by the emitted code are properly available from the module as though they were written manually or directly.
The LSP (rust-analyzer) was able to autocomplete these immediately once the macro was written and called.
I did a brief search for actual units systems and found the following
dimensioned
One approach represents each unit by its exponents over a set of base units. If the base units are [seconds, meters, kgs, amperes, kelvins, moles, candelas], then the Newton, m.kg / s^2, would be [-2, 1, 1, 0, 0, 0, 0].
Unfortunately, I'm not sure if the repository is still maintained.
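The exponent idea can be sketched at runtime in a few lines. This is only an illustration of the arithmetic, with made-up names (Quantity, Dims, SECOND, METER, KILOGRAM); it is not how any particular crate does it, and real units libraries typically push this bookkeeping into the type system so mismatches fail at compile time.

```rust
use std::ops::{Div, Mul};

// Exponents over the base units, in the order:
// [seconds, meters, kgs, amperes, kelvins, moles, candelas]
type Dims = [i8; 7];

#[derive(Debug, Clone, Copy, PartialEq)]
struct Quantity {
    value: f64,
    dims: Dims,
}

const SECOND: Quantity = Quantity { value: 1.0, dims: [1, 0, 0, 0, 0, 0, 0] };
const METER: Quantity = Quantity { value: 1.0, dims: [0, 1, 0, 0, 0, 0, 0] };
const KILOGRAM: Quantity = Quantity { value: 1.0, dims: [0, 0, 1, 0, 0, 0, 0] };

impl Mul for Quantity {
    type Output = Quantity;
    // Multiplying two quantities adds their exponents.
    fn mul(self, rhs: Quantity) -> Quantity {
        let mut dims = self.dims;
        for (d, r) in dims.iter_mut().zip(rhs.dims.iter()) {
            *d += *r;
        }
        Quantity { value: self.value * rhs.value, dims }
    }
}

impl Div for Quantity {
    type Output = Quantity;
    // Dividing subtracts the exponents of the divisor.
    fn div(self, rhs: Quantity) -> Quantity {
        let mut dims = self.dims;
        for (d, r) in dims.iter_mut().zip(rhs.dims.iter()) {
            *d -= *r;
        }
        Quantity { value: self.value / rhs.value, dims }
    }
}

fn main() {
    // One Newton: m.kg / s^2
    let newton = METER * KILOGRAM / (SECOND * SECOND);
    println!("{:?}", newton.dims); // [-2, 1, 1, 0, 0, 0, 0]
}
```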
Interestingly, F# actually has a system built in!
https://pony.social/@thephd/112818744298401332
A lot of people think I'm being sarcastic here, which is fair because I only went toe-to-toe against people on Twitter and didn't do much here, so I'll state my full opinion below anyhow:

I would agree with anyone about not wanting to replace C (or C++). But, C has been alive for 50 years (or just 35 from C89) and Rust has been alive for just barely under 10 (since Rust 1.0). Even if you measure the last 10 years of Rust versus the last 10 years of C or C++, one of these languages is making leaps and bounds ahead in providing people better primitives to do good work.

SafeInt secured pretty much all of Microsoft Office from some of the hardest bugs back in, around, 2005. C++ still lacks safe integer primitives; C only just got 3 functions to do overflow-checked math in C23, after David Svoboda campaigned for years. Rust just... has them baked into the standard library, for all the types you care about, too.

Similarly, people have been having memory issues in C and C++ for a while too. Most of the way to get better has been clamping down on static analysis and doing more testing, but we're still getting these errors. Meanwhile, teams writing Rust have been making way less errors on this in all the openly-published data from corporations like Google, and privately we are hearing a lot more about people taking complex financial and parsing code and turning it into Rust and having a fraction of the issues.

Even if I want to see C doing better, I have to acknowledge we were (a) too slow and not brave enough to do the things that could fix these portions of the language; (b) have fundamental design issues in the language itself that make ownership impossible to integrate as part of the language without breaking a ton of code; (c) do not provide good in-language tools and keep depending on vendors to "do the right thing" (i.e. adding or expanding U.B. and then just saying "vendors will check it" rather than taking responsibility with our language design); (d) are moving monumentally too slow to address the needs of the industry that many people -- especially security people -- have been yelling about since the mid 90s.

As much as I just want to pretend that I can write off every developer with "haha lole skill issue test better sanitize better IDIOT", if the root cause on this bug is "there was some C and/or C++ code that looked nominally correct but did batshit insanity in production", we absolutely will have problems to answer for. This doesn't absolve CrowdStrike for cutting 100s of workers and playing fast and loose, this doesn't excuse the fact that hospitals went down and people likely dead from lack of access to care, this doesn't change that it's abhorrent to have unmitigated hardware access in Ring0 just for a "security product", which has been the trend of every app wanting to plug in its own RootKit-like tool just for the sake of "app security" lately (League, NProtect, School Exam Spyware, etc.).

There's a LOT of levels of "what the fuck have we let happen?" in play here, but I don't control those other levels. I'm responsible for C, so I'm gonna look at the C bit. Other people responsible for the other parts of this stack should, hopefully, take sincere responsibility for those parts. (I doubt it, though, lmao.)
https://cliffle.com/p/dangerust/
Just a quick riff/hack on whether it'd be hard to make a collect() method that "collected" into a Vec without needing any turbofish (see, if you're interested, my prior post on the turbofish).
Some grasp of traits and iteration is required to comfortably get this ... though it might be a fun dive even if you're not
collect
The implementation of collect is:
fn collect<B: FromIterator<Self::Item>>(self) -> B
where
Self: Sized,
{
FromIterator::from_iter(self)
}
The generic type B is bound by FromIterator, which basically enables a type to be constructed from an Iterator. In other words, collect() returns any type that can be built from an iterator. EG, Vec.
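To make "any type that can be built from an iterator" concrete, here is a minimal sketch of a custom type opting in by implementing FromIterator. The Total type is made up for illustration. The explicit use line is only needed before the 2021 edition, where FromIterator is not yet in the prelude.

```rust
use std::iter::FromIterator;

// Hypothetical type: "collecting" into it just sums the items.
#[derive(Debug, PartialEq)]
struct Total(i32);

impl FromIterator<i32> for Total {
    fn from_iter<I: IntoIterator<Item = i32>>(iter: I) -> Self {
        Total(iter.into_iter().sum())
    }
}

fn main() {
    // The annotation on `total` tells collect() which FromIterator impl to use.
    let total: Total = vec![1, 2, 3].into_iter().collect();
    println!("{:?}", total); // Total(6)
}
```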
The reason the turbofish comes about is that, as I said above, collect() returns "any type" that can be built from an iterator. So when we run something like:
let z = [1i32, 2, 3].into_iter().collect();
... we have a problem ... rust, or the collect() method, has no idea what type we're building/constructing.
More specifically, looking at the code for collect, in the call of FromIterator::from_iter(self), which is calling the method on the trait directly, rust has no way to determine which implementation of the trait to use. The one on Vec or HashMap or String etc??
Thus, the turbofish syntax specifies the generic type B which (somehow through type inference???) then determines which implementation to use.
let z = [1i32, 2, 3].into_iter().collect::<Vec<_>>();
IE: Use the implementation on Vec!
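The turbofish is only one way of pinning down B. As a small sketch (the function name evens and the sample data are made up), a type annotation on the binding or a function's return type gives the compiler the same information, and the same kind of iterator can be collected into quite different types:

```rust
use std::collections::HashSet;

// The declared return type tells collect() which FromIterator impl to use.
fn evens(limit: u32) -> Vec<u32> {
    (0..limit).filter(|n| n % 2 == 0).collect()
}

fn main() {
    let words = vec!["a", "b", "a"];

    // 1. Turbofish on the call.
    let as_vec = words.iter().collect::<Vec<_>>();
    // 2. A type annotation on the binding.
    let as_set: HashSet<&str> = words.iter().copied().collect();
    // 3. Same items, different target type: String implements FromIterator<&str>.
    let as_string: String = words.iter().copied().collect();

    println!("{} {} {} {:?}", as_vec.len(), as_set.len(), as_string, evens(5));
}
```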
Vec?
I figure Vec is used so often as the type for collecting an Iterator that it could be nice to have a convenient method.
The docs even hint at this by suggesting that calling the FromIterator::from_iter() method directly from the desired type (eg Vec) can be more readable (see FromIterator docs).
EG ... using collect:
let d = [1i32, 2, 3];
let x = d.iter().map(|x| x + 100).collect::<Vec<_>>();
Using Vec::from_iter():
let y = Vec::from_iter(d.iter().map(|x| x + 100));
As Vec is always in the prelude (IE, it's always available), using from_iter clearly seems like a nicer option here (though note the FromIterator trait itself is only in the prelude from the 2021 edition onwards).
But you lose method chaining! So ... how about a method on Iterator, like collect but for Vec specifically? How would you make that and is it hard??
collect_vec()
It's not hard actually
Create a trait CollectVec that defines a method collect_vec which returns Vec<Self::Item>
Bind it to Iterator (or, make Iterator the "supertrait") so that the Iterator::collect() method is always available
Implement CollectVec for all types that implement Iterator by just calling self.collect() ... the type inference will take care of the rest, because it's clear that a Vec will be used.
trait CollectVec: Iterator {
fn collect_vec(self) -> Vec<Self::Item>;
}
impl<I: Iterator> CollectVec for I {
fn collect_vec(self) -> Vec<Self::Item> {
self.collect()
}
}
With this you can then do the following:
let d = [1i32, 2, 3];
let d2 = d.iter().map(|x| x + 1).collect_vec();
Don't know about you, but implementing such methods for the common collection types would suit me just fine ... that turbofish is a pain to write ... and AFAICT this isn't inconsistent with rust's style/design. And it's super easy to implement ... the type system handles this issue very well.
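For instance, the same recipe seems to carry over to HashSet with one extra bound on the item type. A sketch only: the CollectSet trait and collect_set method are hypothetical names, not anything in the standard library.

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Hypothetical companion to CollectVec, built the same way.
trait CollectSet: Iterator {
    fn collect_set(self) -> HashSet<Self::Item>;
}

// Blanket impl for every iterator whose items can live in a HashSet.
impl<I> CollectSet for I
where
    I: Iterator,
    I::Item: Eq + Hash,
{
    fn collect_set(self) -> HashSet<Self::Item> {
        // The return type fixes the target, so no turbofish is needed here.
        self.collect()
    }
}

fn main() {
    let d = [1i32, 2, 2, 3];
    let parities = d.iter().map(|x| x % 2).collect_set();
    println!("{}", parities.len()); // 2 distinct values: 0 and 1
}
```

The Eq + Hash bound lives on the impl rather than the trait, so iterators over non-hashable items simply don't get the method.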
https://doc.rust-lang.org/stable/std/f32/consts/index.html
Basic mathematical constants.
https://doc.rust-lang.org/std/thread/fn.spawn.html
Spawns a new thread, returning a `JoinHandle` for it.
I'll be clear, quite embarrassingly I bit my tongue hard last night and haven't been talking right all day. Hurts to talk, hurts to eat, and worst of all hot tea is undrinkable. How will I live. Now I know exactly what it feels like to be a soldier wounded in combat.
Will resume next week in full force. In the meantime however please feel free to read ahead. Or, alternatively, try out a few leetcode/advent of code questions. This is what I'll be doing tonight.
https://turbo.fish/