Introduction
Wherever there is a design pattern, there is also an underlying can of worms in search of a solution. The New Type pattern is defined by two entities: a wrapper type and a wrapped type.
A very simple example is wrapping a numeric type that represents a person’s age, like so:
pub struct Age(i32);
// Before: the parameter is a bare integer.
pub fn is_user_allowed(age: i32) {...}
// After: the parameter is the wrapper type.
pub fn is_user_allowed(age: Age) {...}
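To make the snippet above concrete, here is a minimal sketch of how it might be filled in. The constructor `Age::new`, the accessor `years`, the `bool` return type and the "18 or older" rule are all assumptions made for illustration; the original signatures leave the body elided.

```rust
/// Newtype wrapper around the raw age value; the inner field stays private.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Age(i32);

impl Age {
    /// Hypothetical constructor: rejects negative ages.
    pub fn new(years: i32) -> Option<Age> {
        if years >= 0 {
            Some(Age(years))
        } else {
            None
        }
    }

    /// Hypothetical read-only accessor for the wrapped value.
    pub fn years(&self) -> i32 {
        self.0
    }
}

/// Hypothetical rule: users must be at least 18 years old.
pub fn is_user_allowed(age: Age) -> bool {
    age.years() >= 18
}

// Passing a bare integer, e.g. `is_user_allowed(21)`, does not compile:
// the compiler expects an `Age`, not an `i32`.
```

The caller is now forced to go through `Age::new` before it can call `is_user_allowed`, so the signature alone tells you what kind of number is expected.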
So what have we accomplished with this? List of benefits incoming (Spoiler: the last 4 are specific to Rust).
Right off the bat, our API becomes more explicit; compare the two function signatures above.
It is a lightweight way of achieving encapsulation. Users of your API don’t need to know your internal data structures, and you can publicly expose only some of the behaviour of the underlying data structure.
Keep your API backwards compatible: by wrapping a type, you can change the implementation details without changing the public interface.
The wrapper and wrapped types are not compatible, which allows you to leverage the type system to impose certain restrictions on the caller.
Introduce move semantics on types with copy semantics.
You can create new types with different lifetimes or mutability. E.g. wrapping a mutable reference, ensuring that the memory location cannot be modified outside the new type’s defined methods.
Bypass the orphan rule (You can implement foreign traits for foreign types like this).
And of course the most important one, straight out of “the Rust book”: “There is no runtime performance penalty for using this pattern, and the wrapper type is elided at compile time.”
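Three of the Rust-specific benefits above can be sketched in a few lines. All type and function names here (`Token`, `redeem`, `Names`, `Counter`) are hypothetical and exist only for illustration; the `Display` wrapper follows the same idea as the wrapper around `Vec<String>` shown in the Rust book.

```rust
use std::fmt;

// 1. Move semantics on a Copy type: u64 is Copy, but the wrapper derives
//    neither Clone nor Copy, so a Token is moved when passed by value.
pub struct Token(u64);

impl Token {
    pub fn new(value: u64) -> Self {
        Token(value)
    }
}

/// Consumes the token; using the same token again is a compile error.
pub fn redeem(token: Token) -> u64 {
    token.0
}

// 2. Orphan rule: Display and Vec<String> are both defined outside this
//    crate, so Display cannot be implemented for Vec<String> directly.
//    The local wrapper type makes the impl legal.
pub struct Names(Vec<String>);

impl Names {
    pub fn new(names: Vec<String>) -> Self {
        Names(names)
    }
}

impl fmt::Display for Names {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

// 3. Wrapping a mutable reference: while the Counter is alive, the borrowed
//    value can only be changed through the methods defined here.
pub struct Counter<'a>(&'a mut u32);

impl<'a> Counter<'a> {
    pub fn new(value: &'a mut u32) -> Self {
        Counter(value)
    }

    pub fn increment(&mut self) {
        *self.0 += 1;
    }

    pub fn get(&self) -> u32 {
        *self.0
    }
}
```

In each case the wrapper adds no fields of its own; the restrictions come purely from which traits and methods the new type chooses to expose.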
Make invalid state unrepresentable
This is one of the best practices of type-driven development: encode the constraints of the domain you are modeling inside the type system and rely on the compiler to enforce them.
We will be looking at validating an email, which is a common task for applications that send you an email notification on certain actions. Imagine your use case is to update the email address of a user.
We can write a validation function as below, but at what cost?
- You cannot ensure the email holds its invariants at all times (being non-empty, at least 4 characters long, regex valid etc.), because the email is actually just a String with no constraints.
- Can you tell whether the validation function is called at every call site of your update_email() function? You will have to call it inside the function just to be sure, and other parts of your app may still use the email incorrectly.
- What happens if your update_email() function (stupid example) grows too big and you have to split it? Now you have to validate the email in all of those places.
- The worst part is that calling the validation function every time you need it does not store the result anywhere: the function performs the validation and returns a boolean, and the knowledge that the email is valid is lost right away.
pub fn email_is_valid(email: String) -> bool {
    if email.is_empty() {
        return false;
    }
    if email.len() < 4 {
        return false;
    }
    // Perform some regex validation here
    true
}
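To see the cost in practice, here is a small sketch of the problems listed above. The `User` struct and the functions `update_email` and `import_email` are hypothetical, and the validation function takes a `&str` instead of a `String` (an assumption made to avoid cloning); the regex step is left out.

```rust
/// Same checks as the validation function above, minus the regex step.
pub fn email_is_valid(email: &str) -> bool {
    !email.is_empty() && email.len() >= 4
}

/// Hypothetical user record: the email is just a String.
pub struct User {
    pub email: String,
}

/// Hypothetical update function that remembers to validate.
/// Returns false and leaves the user untouched if the email is invalid.
pub fn update_email(user: &mut User, email: String) -> bool {
    if !email_is_valid(&email) {
        return false;
    }
    user.email = email;
    true
}

/// Hypothetical second call site, written later, that forgets the check.
/// It compiles fine, because a validated email and an unvalidated one
/// share the same type: String.
pub fn import_email(user: &mut User, email: String) {
    user.email = email;
}
```

Nothing in the types distinguishes the two functions, so the compiler cannot warn you that `import_email` can store an empty string as an email address.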
Now let’s look at parsing. Parsing, by definition, is just a way of consuming unstructured data and producing a more structured result. We are wrapping our email String in the Email type, which comes with all the benefits listed in the previous section. But let me list some more for this specific example:
The inner type is now private, so nobody can misuse the email; they have to go through our Email type “guard”.
Now you may think parse is just a copy of the validation function above, but the return type is different: it gives back a valid Email.
Moreover, parse() is the only way of creating an Email object, which rules out invalid states right off the bat. We’ve made it impossible to violate the Email invariants/constraints in any way. Error handling will be the subject of a future post, but please take a moment to appreciate the robustness and beauty of Rust’s error model, which forces you to deal with possible errors on the caller side through its Result type, rather than reaching for the infamous try/catch with a catch-all clause just in case 🤢.
use thiserror::Error;

/// The inner value is not public, and not accessible directly by users.
pub struct Email(String);

impl Email {
    pub fn parse(email: String) -> Result<Email, EmailError> {
        if email.trim().is_empty() {
            return Err(EmailError::EmptyEmail);
        }
        if email.len() < 4 {
            return Err(EmailError::InvalidEmailLength);
        }
        // Perform some regex validation here
        Ok(Email(email))
    }
}

#[derive(Error, Debug)]
pub enum EmailError {
    #[error("The email cannot be empty")]
    EmptyEmail,
    #[error("The length of your email address is invalid")]
    InvalidEmailLength,
    #[error("The email address is invalid")]
    InvalidRegex,
}
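Here is a rough sketch of what the calling side could look like. To keep it free of external crates, the error type implements `Display` and `std::error::Error` by hand instead of using thiserror, and the regex variant is left out. The `as_str` accessor, the `User` struct, `update_email` and `handle_request` are hypothetical names introduced for illustration.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum EmailError {
    EmptyEmail,
    InvalidEmailLength,
}

// Hand-written stand-in for the thiserror derive, to avoid a dependency.
impl fmt::Display for EmailError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EmailError::EmptyEmail => write!(f, "The email cannot be empty"),
            EmailError::InvalidEmailLength => {
                write!(f, "The length of your email address is invalid")
            }
        }
    }
}

impl std::error::Error for EmailError {}

/// The inner value is private; parse() is the only way to build an Email.
pub struct Email(String);

impl Email {
    pub fn parse(email: String) -> Result<Email, EmailError> {
        if email.trim().is_empty() {
            return Err(EmailError::EmptyEmail);
        }
        if email.len() < 4 {
            return Err(EmailError::InvalidEmailLength);
        }
        Ok(Email(email))
    }

    /// Hypothetical read-only accessor for the validated address.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Hypothetical user record that can only hold a parsed Email.
pub struct User {
    pub email: Email,
}

/// No validation needed here: the type proves it already happened.
pub fn update_email(user: &mut User, email: Email) {
    user.email = email;
}

/// Hypothetical boundary function: raw input is parsed once, and the
/// `?` operator hands any EmailError back to the caller.
pub fn handle_request(user: &mut User, raw: String) -> Result<(), EmailError> {
    let email = Email::parse(raw)?;
    update_email(user, email);
    Ok(())
}
```

Parsing happens once at the edge of the application; everything past that point takes an `Email` and never has to ask whether it is valid.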
Conclusion
We have tapped into type-driven development to make invalid state unrepresentable by construction. Rust did not invent this idea, but it has an expressive type system that allows you to build more robust backends than you otherwise would.
Documentation
You can find more details in the following resources:
- Zero to production book chapter 6 and 7
- Parse don’t validate
- The Rust book chapter 19
Disclaimer: This blog post summarises some aspects of the “New Type” pattern in Rust. It represents my personal notes from various sources such as “The Rust Programming Language” book and various internet articles. No copyright infringement intended.