https://blog.sylver.dev/build-your-own-sqlite-part-1-listing-tables
As developers, we use databases all the time. But how do they work? In this series, we'll try to answer that question by building our own SQLite-compatible database from scratch. Source code examples will be provided in Rust, but you are encouraged t...
Having read through the macros section of "The Book" (Chapter 19.6), I thought I would try to hack together a simple idea using macros as a way to get a proper feel for them.
The chapter was a little light, and declarative macros (using macro_rules!), which is what I'll be using below, seemed like a potentially very nice feature of the language ... the sort of thing that really makes the language malleable. Indeed, in poking around I've realised, perhaps naively, that macros are a pretty common tool for rust devs (or at least more common than I knew).
I'll rant for a bit first, which those new to rust macros may find interesting or informative (it's kinda a little tutorial) ... to see the implementation, go to "Implementation (without using a macro)" heading and what follows below.
Well, "declarative macros" (with macro_rules!) were pretty useful I found and easy to get going with (such that it makes perfect sense that they're used more frequently than I thought).
rust-analyzer (LSP) understands what you're emitting perfectly well in my experience. It really felt properly native to rust.
- Use macro_rules! to declare a new macro
  - Yep, it's also a macro!
- Create a structure just like a match expression
- Write a pattern as just rust code with "generic code fragment" elements
  - Mark each generic fragment with a $ like so: $GENERIC_CODE.
  - Give each one a fragment specifier like so: $GENERIC_CODE:expr
  - The fragment specifiers are:
    - block
    - expr is used for expressions
    - ident is used for variable/function names
    - item
    - literal is used for literal constants
    - pat (pattern)
    - path
    - stmt (statement)
    - tt (token tree)
    - ty (type)
    - vis (visibility qualifier)
So a basic pattern that matches on any struct while capturing the struct's name, its only field's name, and its type would be:
macro_rules! my_new_macro {
    (
        struct $name:ident {
            $field:ident: $field_type:ty
        }
    )
}
Now, $name, $field and $field_type will be captured for any single-field struct (and, presumably, the validity of the syntax enforced by the "fragment specifiers").
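To see those three captures actually doing something, here is a minimal sketch that completes the pattern with an emitted body. The macro name describe_struct and the describe() function are made up for illustration and are not from the post; the body just re-emits the struct and reports the captured fragments with stringify!.

```rust
// Hypothetical macro (name invented for this sketch): same pattern as above,
// plus an emitted body so the captures can be observed.
macro_rules! describe_struct {
    (
        struct $name:ident {
            $field:ident: $field_type:ty
        }
    ) => {
        // Re-emit the struct unchanged ...
        #[allow(dead_code)]
        struct $name {
            $field: $field_type,
        }

        // ... and add a function that reports what was captured.
        impl $name {
            fn describe() -> String {
                format!(
                    "{} {{ {}: {} }}",
                    stringify!($name),
                    stringify!($field),
                    stringify!($field_type)
                )
            }
        }
    };
}

describe_struct! {
    struct Wrapper {
        inner: i32
    }
}

fn main() {
    println!("{}", Wrapper::describe());
}
```

A struct with two fields, or a field without a type, fails to match the pattern and is rejected at compile time.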
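- Capture any repeated patterns with + or * (as in regex)
  - Wrap the repeated pattern in $( ... )
  - Put any separator straight after it: $( ... ),
  - Then add the repetition operator: $( ... ),+
So, to capture multiple fields in a struct (expanding from the example above):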
macro_rules! my_new_macro {
    (
        struct $name:ident {
            $field:ident: $field_type:ty,
            $( $ff:ident : $ff_type:ty ),*
        }
    )
}
- Use => as with match expressions
  - The emitted code goes in => { ... }, IE with braces (not sure why)
- Write the new emitted code
  - Captured fragments are emitted by name: $GENERIC_CODE.
  - Repeated captures are emitted with $( ... ),*, including the separator (which can be different from the one used in the capture).
For example, we could convert the struct to an enum where each field became a variant with an enclosed value of the same type as the struct:
macro_rules! my_new_macro {
    (
        struct $name:ident {
            $field:ident: $field_type:ty,
            $( $ff:ident : $ff_type:ty ),*
        }
    ) => {
        enum $name {
            $field($field_type),
            $( $ff($ff_type) ),*
        }
    }
}
With the above macro defined ... this code ...
my_new_macro! {
    struct Test {
        a: i32,
        b: String,
        c: Vec<String>
    }
}
... will emit this code ...
enum Test {
    a(i32),
    b(String),
    c(Vec<String>)
}
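The point about the emitted separator being allowed to differ from the captured one is easy to miss, so here is a small sketch of it. The macro is_one_of is hypothetical (not from the post): it captures comma-separated literals and emits them separated by | so they form a single match arm.

```rust
// Hypothetical macro for illustration: captured with `,`, emitted with `|`.
macro_rules! is_one_of {
    ( $value:expr => $( $candidate:literal ),+ ) => {
        match $value {
            // Same repetition, different separator: expands to `1 | 2 | 3`.
            $( $candidate )|+ => true,
            _ => false,
        }
    };
}

fn main() {
    println!("{}", is_one_of!(2 => 1, 2, 3));
    println!("{}", is_one_of!(5 => 1, 2, 3));
}
```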
Basically ... a simple system for custom types to represent physical units.
A basic pattern I've sometimes implemented on my own (without bothering with dependencies that is) is creating some basic representation of physical units in the type system. Things like meters or centimetres and degrees or radians etc.
If your code relies on such and performs conversions at any point, it is way too easy to fuck up, and therefore worth, IMO, creating some safety around. NASA provides an obvious warning. As does, IMO, common sense and experience: most scientists and physical engineers learn the importance of "dimensional analysis" of their calculations.
In fact, it's the sort of thing that should arguably be built into any language that takes types seriously (like eg rust). I feel like there could be an argument that it'd be as reasonable as the numeric abstractions we've worked into programming??
At the bottom I'll link whatever crates I found for doing a better job of this in rust (one of which seemed particularly interesting).
The essential design is (again, this is basic):
#[derive(Debug)]
pub enum TimeUnits { s, ms, us }
#[derive(Debug)]
pub struct Time {
    pub value: f64,
    pub unit: TimeUnits,
}
impl Time {
    pub fn new<T: Into<f64>>(value: T, unit: TimeUnits) -> Self {
        Self { value: value.into(), unit }
    }
    fn unit_conv_val(unit: &TimeUnits) -> f64 {
        match unit {
            TimeUnits::s => 1.0,
            TimeUnits::ms => 0.001,
            TimeUnits::us => 0.000001,
        }
    }
    fn conversion_factor(&self, unit_b: &TimeUnits) -> f64 {
        Self::unit_conv_val(&self.unit) / Self::unit_conv_val(unit_b)
    }
    pub fn convert(&self, unit: TimeUnits) -> Self {
        Self {
            value: (self.value * self.conversion_factor(&unit)),
            unit
        }
    }
}
So, we've got:
- An enum TimeUnits representing the various units of time we'll be using
- A struct Time that will be any given value of "time" expressed in any given unit
- A match expression on the unit that hardcodes the conversions (relative to a base unit of seconds ... see the conversion_factor() method which generalises the conversion values).
Note: I'm using T: Into<f64> for the new() method and f64 for Time.value as that is the easiest way I know to accept either integers or floats as values. It works because i32 (and the other numeric types up to 32 bits wide) can be converted losslessly to f64. The 64-bit integers can't, so they don't implement Into<f64>.
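A quick sketch of what the Into<f64> bound does and doesn't accept. The helper to_f64 is made up for this example; it just isolates the same bound that new() uses.

```rust
// Hypothetical helper: the same `T: Into<f64>` bound as `Time::new`.
fn to_f64<T: Into<f64>>(value: T) -> f64 {
    value.into()
}

fn main() {
    // Integers and floats up to 32 bits wide convert losslessly.
    println!("{}", to_f64(3_i32));
    println!("{}", to_f64(7_u8));
    println!("{}", to_f64(2.5_f32));

    // A 64-bit integer is rejected at compile time, because f64 cannot
    // represent every i64 exactly and so `From<i64>` is not implemented:
    // to_f64(3_i64);
}
```

For 64-bit integers an explicit `as f64` cast at the call site is one way around this, accepting the possible loss of precision.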
Obviously you can go further than this. But the essential point is that each unit needs to be a new type with all the desired functionality implemented manually or through some handy use of blanket trait implementations.
For something pretty basic, the above is an annoying amount of boilerplate!! May as well rely on a dependency!?
Well, we can write the boilerplate once in a macro and then only provide the informative parts!
In the case of the above, the only parts that matter are:
- the name of the struct
- the name of the enum type we'll use (as they'll flag units throughout the codebase)
IE, for the above, we only need to write something like:
struct Time {
    value: f64,
    unit: TimeUnits,
    s: 1.0,
    ms: 0.001,
    us: 0.000001
}
Note: this isn't valid rust! But that doesn't matter, so long as we can write a pattern that matches it and emit valid rust from the macro, it's all good! (Which means we can write our own little DSLs with native macros!!)
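As a tiny illustration of the "own little DSL" point, here is a sketch where the macro input is nothing but name: value pairs, which is not valid rust anywhere on its own. The names conversion_table and factor_for are invented for this example.

```rust
// Hypothetical macro: turns `name: value` pairs into a lookup function.
macro_rules! conversion_table {
    ( $( $unit:ident : $factor:expr ),+ ) => {
        fn factor_for(unit: &str) -> Option<f64> {
            // One `if` per captured pair; the unit name becomes a string.
            $(
                if unit == stringify!($unit) {
                    return Some($factor);
                }
            )+
            None
        }
    };
}

conversion_table! {
    s: 1.0,
    ms: 0.001,
    us: 0.000001
}

fn main() {
    println!("{:?}", factor_for("ms"));
    println!("{:?}", factor_for("hours"));
}
```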
To capture this, all we need are what we've already done above: capture the first two fields and their types, then capture the remaining "field names" and their values in a repeating pattern.
The pattern
macro_rules! unit_gen {
    (
        struct $name:ident {
            $v:ident: f64,
            $u:ident: $u_enum:ident,
            $( $un:ident : $value:expr ),+
        }
    )
}
Note that each repeated fragment is an ident with a : and an expr after it, despite being invalid rust.
The Full Macro
macro_rules! unit_gen {
    (
        struct $name:ident {
            $v:ident: f64,
            $u:ident: $u_enum:ident,
            $( $un:ident : $value:expr ),+
        }
    ) => {
        #[derive(Debug)]
        pub struct $name {
            pub $v: f64,
            pub $u: $u_enum,
        }
        impl $name {
            // Value of each unit relative to the base unit
            fn unit_conv_val(unit: &$u_enum) -> f64 {
                match unit {
                    $(
                        $u_enum::$un => $value
                    ),+
                }
            }
            fn conversion_factor(&self, unit_b: &$u_enum) -> f64 {
                Self::unit_conv_val(&self.$u) / Self::unit_conv_val(unit_b)
            }
            pub fn convert(&self, unit: $u_enum) -> Self {
                Self {
                    // Use the captured field names, so the macro also works
                    // when the fields aren't called value and unit
                    $v: (self.$v * self.conversion_factor(&unit)),
                    $u: unit
                }
            }
        }
        #[derive(Debug)]
        pub enum $u_enum {
            $( $un ),+
        }
    }
}
Note the repeating capture is used twice here in different ways.
The capture is: $( $un:ident : $value:expr ),+
And in the emitted code:
- In the unit_conv_val method as: $( $u_enum::$un => $value ),+
  - Here the ident $un is being used as the variant of the enum that is defined later in the emitted code
  - $u_enum is also used without issue, as the name/type of the enum, despite not being part of the repeated capture but another variable captured outside of the repeated fragments.
- In the enum definition as: $( $un ),+
Now all of the boilerplate above is unnecessary, and we can just write:
unit_gen! {
    struct Time {
        value: f64,
        unit: TimeUnits,
        s: 1.0,
        ms: 0.001,
        us: 0.000001
    }
}
Usage from main.rs:
use units::Time;
use units::TimeUnits::{s, ms, us};
fn main() {
    let x = Time{value: 1.0, unit: s};
    let y = x.convert(us);
    println!("{:?}", x);
    println!("{:?}", y);
}
Output:
Time { value: 1.0, unit: s }
Time { value: 1000000.0, unit: us }
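The payoff of the macro is that a second unit type costs only a few lines. Below is a self-contained sketch that repeats the macro and generates a hypothetical Length type; Length, LengthUnits and the m/cm/km factors are my own example, not from the post. This copy of the macro emits the captured field names ($v, $u) in convert() and adds an allow attribute to silence the lowercase-variant warning.

```rust
macro_rules! unit_gen {
    (
        struct $name:ident {
            $v:ident: f64,
            $u:ident: $u_enum:ident,
            $( $un:ident : $value:expr ),+
        }
    ) => {
        #[derive(Debug)]
        pub struct $name {
            pub $v: f64,
            pub $u: $u_enum,
        }
        impl $name {
            // Value of each unit relative to the base unit.
            fn unit_conv_val(unit: &$u_enum) -> f64 {
                match unit {
                    $( $u_enum::$un => $value ),+
                }
            }
            fn conversion_factor(&self, unit_b: &$u_enum) -> f64 {
                Self::unit_conv_val(&self.$u) / Self::unit_conv_val(unit_b)
            }
            pub fn convert(&self, unit: $u_enum) -> Self {
                Self {
                    $v: self.$v * self.conversion_factor(&unit),
                    $u: unit,
                }
            }
        }
        #[allow(non_camel_case_types)]
        #[derive(Debug)]
        pub enum $u_enum {
            $( $un ),+
        }
    };
}

// Hypothetical second unit type, generated by the same macro.
unit_gen! {
    struct Length {
        value: f64,
        unit: LengthUnits,
        m: 1.0,
        cm: 0.01,
        km: 1000.0
    }
}

fn main() {
    let trip = Length { value: 1.5, unit: LengthUnits::km };
    let in_meters = trip.convert(LengthUnits::m);
    println!("{:?} -> {:?}", trip, in_meters);
}
```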
- The struct and enum created by the emitted code are properly available from the module as though they were written manually or directly.
- The LSP (rust-analyzer) was able to autocomplete these immediately once the macro was written and called.
I did a brief search for actual units systems and found the following:
dimensioned
If the base units are [seconds, meters, kgs, amperes, kelvins, moles, candelas], then the Newton, m.kg / s^2, would be [-2, 1, 1, 0, 0, 0, 0] (one exponent per base unit).
Unfortunately, I'm not sure if the repository is still maintained.
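The exponent-list idea can be sketched by hand in a few lines. This is my own illustration of the representation, not the API of any crate: multiplying units adds exponents, dividing subtracts them, and the Newton falls out as the list quoted above.

```rust
/// Exponents of the base units, in the order
/// [seconds, meters, kgs, amperes, kelvins, moles, candelas].
/// (Hand-rolled type for this sketch only.)
type Dimension = [i8; 7];

const SECOND: Dimension = [1, 0, 0, 0, 0, 0, 0];
const METER: Dimension = [0, 1, 0, 0, 0, 0, 0];
const KILOGRAM: Dimension = [0, 0, 1, 0, 0, 0, 0];

/// Multiplying two units adds their exponents.
fn multiply(a: Dimension, b: Dimension) -> Dimension {
    let mut out = [0_i8; 7];
    for i in 0..7 {
        out[i] = a[i] + b[i];
    }
    out
}

/// Dividing two units subtracts their exponents.
fn divide(a: Dimension, b: Dimension) -> Dimension {
    let mut out = [0_i8; 7];
    for i in 0..7 {
        out[i] = a[i] - b[i];
    }
    out
}

/// The Newton: m.kg / s^2
fn newton() -> Dimension {
    divide(multiply(METER, KILOGRAM), multiply(SECOND, SECOND))
}

fn main() {
    println!("{:?}", newton());
}
```

Two quantities can then only be added or compared when their exponent lists are equal, which is the dimensional-analysis check.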
Interestingly, F# actually has a system built in (units of measure)!
I looked around and struggled to find out what it does?
My guess would be that it notifies you of when new posts are made to communities you subscribe to. But that sounds like a lot, so I'm really not sure.
Otherwise, is it me or does the wording here not speak for itself?
Generally, the lens through which I've come to criticise any/all fediverse projects is how well they foster community building. One reason why I like and "advocate" for the lemmy/threadiverse side of things is precisely because of this and how the centrality of the community/sub/group is a good way of organising social media (IMO).
Also, because of that, I recently came to be skeptical of the effects that the "All" feed can have. I didn't even realise that people relied mostly on the All feed until recently.
I think I've reached the point now of being against it (at least tentatively). I know, it's a staple and there's no way it's going away. And I know it's useful.
But thinking about the feature set, through the community building lens, I think it'd be fair to say that things are out of balance: they don't promote community building enough while also providing the All feed which dissolves community building.
Not really a criticism of the developers ... AFAIU, the All feed is easier to implement than any other community building feature ... and it's expected from reddit (though it isn't normal on forums AFAICT, which is maybe worth considering for anyone happy to reassess what about reddit is retained and what isn't).
But still, I can imagine a platform that is more focused on communities:
A possibly interesting and frustrating aspect of all of these suggestions/ideas above is I can see their federation being problematic or difficult ... which raises the issue of whether there's serious tension between platform design and protocol capabilities.
https://www.youtube.com/watch?v=QoSdJB4D3Fc
https://pony.social/@thephd/112818744298401332
A lot of people think I'm being sarcastic here, which is fair because I only went toe-to-toe against people on Twitter and didn't do much here, so I'll state my full opinion below anyhow: I would agree with anyone about not wanting to replace C (or C++). But, C has been alive for 50 years (or just 35 from C89) and Rust has been alive for just barely under 10 (since Rust 1.0). Even if you measure the last 10 years of Rust versus the last 10 years of C or C++, one of these languages is making leaps and bounds ahead in providing people better primitives to do good work. SafeInt secured pretty much all of Microsoft Office from some of the hardest bugs back in, around, 2005. C++ still lacks safe integer primitives; C only just got 3 functions to do overflow-checked math in C23, after David Svoboda campaigned for years. Rust just... has them baked into the standard library, for all the types you care about, too. Similarly, people have been having memory issues in C and C++ for a while too. Most of the way to get better has been clamping down on static analysis and doing more testing, but we're still getting these errors. Meanwhile, teams writing Rust have been making way less errors on this in all the openly-published data from corporations like Google, and privately we are hearing a lot more about people taking complex financial and parsing code and turning it into Rust and having a fraction of the issues. Even if I want to see C doing better, I have to acknowledge we were (a) too slow and not brave enough to do the things that could fix these portions of the language; (b) have fundamental design issues in the language itself that make ownership impossible to integrate as part of the language without breaking a ton of code; (c) do not provide good in-language tools and keep depending on vendors to "do the right thing" (i.e. adding or expanding U.B. 
and then just saying "vendors will check it" rather than taking responsibility with our language design); (d) are moving monumentally too slow to address the needs of the industry that many people -- especially security people -- have been yelling about since the mid 90s. As much as I just want to pretend that I can write off every developer with "haha lole skill issue test better sanitize better IDIOT", if the root cause on this bug is "there was some C and/or C++ code that looked nominally correct but did batshit insanity in production", we absolutely will have problems to answer for. This doesn't absolve CrowdStrike for cutting 100s of workers and playing fast and loose, this doesn't excuse the fact that hospitals went down and people likely dead from lack of access to care, this doesn't change that it's abhorrent to have unmitigated hardware access in Ring0 just for a "security product", which has been the trend of every app wanting to plug in its own RootKit-like tool just for the sake of "app security" lately (League, NProtect, School Exam Spyware, etc.). There's a LOT of levels of "what the fuck have we let happen?" in play here, but I don't control those other levels. I'm responsible for C, so I'm gonna look at the C bit. Other people responsible for the other parts of this stack should, hopefully, take sincere responsibility for those parts. (I doubt it, though, lmao.)
@maegul
@lemmy.ml