source link: https://0x709394.me/Serde-tricks
Preface
Recently, Mario and I have been refactoring rspotify, trying to improve its performance, documentation, error handling, and data model, and to reduce compile time, so that it is easier to use. (For those who have never heard of rspotify, it is a Spotify HTTP SDK implemented in Rust.)
I am partly focusing on polishing the data model, based on the issue created by Koxiaet. Since rspotify is an API client for Spotify, it has to handle requests to and responses from the Spotify HTTP API. Generally speaking, the data model is about how to structure the response data; it uses Serde to parse JSON responses from the HTTP API into Rust structs. I have learnt a lot of Serde tricks from this refactoring.
Serde Lesson
Deserialize JSON map to Vec<T> based on its value.
An actions object, which contains a disallows object, lets a client update the user interface based on which playback actions are available within the current context.
The response JSON data from HTTP API:
{
    ...
    "disallows": {
        "resuming": true
    }
    ...
}
The original model representing actions was:
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct Actions {
    pub disallows: HashMap<DisallowKey, bool>,
}
#[derive(Clone, Serialize, Deserialize, Copy, PartialEq, Eq, Debug, Hash, ToString)]
#[serde(rename_all = "snake_case")]
#[strum(serialize_all = "snake_case")]
pub enum DisallowKey {
    InterruptingPlayback,
    Pausing,
    Resuming,
    ...
}
Koxiaet gave great advice on how to polish Actions:
Actions::disallows can be replaced with a Vec<DisallowKey> or HashSet<DisallowKey> by removing all entries whose value is false, which will result in a simpler API.
To be honest, I was not that familiar with Serde before. After digging into its official documentation for a while, it seemed there was no built-in way to convert a JSON map to a Vec<T> based on the map's values.
After reading the Custom serialization section of the documentation, a simple solution came to mind, so I wrote my first custom deserialize function.
I created a throwaway OriginalActions struct inside the deserialize function, and converted its HashMap to a Vec by keeping only the keys whose value is true.
use serde::{Deserialize, Deserializer, Serialize};
use std::collections::HashMap;
#[derive(Clone, Debug, Serialize, PartialEq, Eq)]
pub struct Actions {
    pub disallows: Vec<DisallowKey>,
}
impl<'de> Deserialize<'de> for Actions {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Mirrors the JSON shape; only used as an intermediate step
        #[derive(Deserialize)]
        struct OriginalActions {
            pub disallows: HashMap<DisallowKey, bool>,
        }
        let original_actions = OriginalActions::deserialize(deserializer)?;
        Ok(Actions {
            disallows: original_actions
                .disallows
                .into_iter()
                .filter(|(_, value)| *value)
                .map(|(key, _)| key)
                .collect(),
        })
    }
}
The types should be familiar if you've used Serde before.
If you're not used to Rust, the function signature will likely look a little strange. What it says is that deserializer will be something that implements Serde's Deserializer trait, and that any data borrowed from the input lives for the 'de lifetime.
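The filtering step is independent of Serde, so it can be sketched with the standard library alone. The helper name enabled_keys below is hypothetical and not part of rspotify. It also sorts the result, an extra step assumed here because HashMap iteration order is unspecified and a Vec built from it would otherwise come out in arbitrary order.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Keep only the keys whose value is `true`.
/// `enabled_keys` is a hypothetical helper name, not part of rspotify.
fn enabled_keys<K: Eq + Hash + Ord>(map: HashMap<K, bool>) -> Vec<K> {
    let mut keys: Vec<K> = map
        .into_iter()
        .filter(|(_, flag)| *flag)
        .map(|(key, _)| key)
        .collect();
    // HashMap iteration order is unspecified, so sort for a stable result.
    keys.sort();
    keys
}
```

If the order does not matter to callers, collecting into a HashSet instead of a Vec avoids the sort; that is the other option mentioned in Koxiaet's advice.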
Deserialize Unix milliseconds timestamp to DateTime<Utc>
A currently playing object contains information about the currently playing item, and its timestamp field is an integer representing the Unix millisecond timestamp at which the data was fetched.
The response JSON data from HTTP API:
{
    ...
    "timestamp": 1490252122574,
    "progress_ms": 44272,
    "is_playing": true,
    "currently_playing_type": "track",
    "actions": {
        "disallows": {
            "resuming": true
        }
    }
    ...
}
The original model was:
/// Currently playing object
///
/// [Reference](https://developer.spotify.com/documentation/web-api/reference/player/get-the-users-currently-playing-track/)
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct CurrentlyPlayingContext {
    pub timestamp: u64,
    pub progress_ms: Option<u32>,
    pub is_playing: bool,
    pub item: Option<PlayingItem>,
    pub currently_playing_type: CurrentlyPlayingType,
    pub actions: Actions,
}
As before, Koxiaet made a great point about timestamp and progress_ms (I will talk about the latter below):
CurrentlyPlayingContext::timestamp should be a chrono::DateTime<Utc>, which could be easier to use.
The polished struct looks like:
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct CurrentlyPlayingContext {
    pub context: Option<Context>,
    #[serde(
        deserialize_with = "from_millisecond_timestamp",
        serialize_with = "to_millisecond_timestamp"
    )]
    pub timestamp: DateTime<Utc>,
    pub progress_ms: Option<u32>,
    pub is_playing: bool,
    pub item: Option<PlayingItem>,
    pub currently_playing_type: CurrentlyPlayingType,
    pub actions: Actions,
}
The deserialize_with attribute tells Serde to use custom deserialization code for the timestamp field. The from_millisecond_timestamp code is:
/// Deserialize Unix millisecond timestamp to `DateTime<Utc>`
pub(in crate) fn from_millisecond_timestamp<'de, D>(d: D) -> Result<DateTime<Utc>, D::Error>
where
    D: de::Deserializer<'de>,
{
    d.deserialize_u64(DateTimeVisitor)
}
The code calls d.deserialize_u64, passing in a struct. The passed-in struct implements Serde's Visitor trait and looks like:
/// Visitor to help deserialize a Unix millisecond timestamp to `chrono::DateTime`
struct DateTimeVisitor;
impl<'de> de::Visitor<'de> for DateTimeVisitor {
    type Value = DateTime<Utc>;
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        write!(
            formatter,
            "a Unix millisecond timestamp representing a DateTime<Utc>"
        )
    }
    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        ...
    }
}
The DateTimeVisitor struct doesn't have any fields; it is just a type that implements a custom visitor, to which the parsing of the u64 is delegated.
Since I did not find a way to construct a DateTime directly from a Unix millisecond timestamp, I had to figure out how to handle the construction. It turns out that there is a way to construct a DateTime from seconds and nanoseconds:
use chrono::{DateTime, TimeZone, NaiveDateTime, Utc};
let dt = DateTime::<Utc>::from_utc(NaiveDateTime::from_timestamp(61, 0), Utc);
Thus, all I need to do is convert the milliseconds into seconds and nanoseconds:
fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
where
    E: de::Error,
{
    let second = v / 1000;
    let nanosecond = ((v % 1000) * 1_000_000) as u32;
    // `second` is at most u64::MAX / 1000, which fits in an i64, so the cast below cannot overflow
    let dt = DateTime::<Utc>::from_utc(
        NaiveDateTime::from_timestamp(second as i64, nanosecond),
        Utc,
    );
    Ok(dt)
}
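The arithmetic in visit_u64 can be checked in isolation. The following is a minimal sketch using only the standard library; split_millis is a hypothetical name, and the cross-check against std::time::Duration is added here purely for illustration.

```rust
use std::time::Duration;

/// Split a Unix millisecond timestamp into whole seconds and the
/// leftover part expressed in nanoseconds.
/// `split_millis` is a hypothetical helper, not part of rspotify or chrono.
fn split_millis(ms: u64) -> (i64, u32) {
    // u64::MAX / 1000 is far below i64::MAX, so this cast cannot overflow.
    let seconds = (ms / 1_000) as i64;
    // The remainder is below 1000, so the product is below 1_000_000_000
    // and fits in a u32.
    let nanoseconds = ((ms % 1_000) * 1_000_000) as u32;
    (seconds, nanoseconds)
}

/// The same split performed by `std::time::Duration`, as a cross-check.
fn split_millis_via_duration(ms: u64) -> (u64, u32) {
    let d = Duration::from_millis(ms);
    (d.as_secs(), d.subsec_nanos())
}
```

With the timestamp from the sample response, 1490252122574 splits into 1490252122 seconds and 574000000 nanoseconds.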
The to_millisecond_timestamp function is similar to from_millisecond_timestamp, but it's easier to implement; check this PR for more detail.
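The body of to_millisecond_timestamp is not shown in this post, so the sketch below only illustrates the round trip it has to preserve. It uses std::time::SystemTime as a stand-in for chrono's DateTime<Utc>, and both function names are hypothetical; the real implementation is the one in the PR.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Build a point in time from a Unix millisecond timestamp.
/// Returns `None` if the platform's `SystemTime` cannot represent it.
fn from_unix_millis(ms: u64) -> Option<SystemTime> {
    UNIX_EPOCH.checked_add(Duration::from_millis(ms))
}

/// Convert a point in time back to a Unix millisecond timestamp.
/// Returns `None` for times before the epoch or if the value overflows a u64.
fn to_unix_millis(t: SystemTime) -> Option<u64> {
    let elapsed = t.duration_since(UNIX_EPOCH).ok()?;
    elapsed
        .as_secs()
        .checked_mul(1_000)?
        .checked_add(u64::from(elapsed.subsec_millis()))
}
```

Serializing and then deserializing a timestamp should give back the same integer, which is the property this pair of functions demonstrates.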
Deserialize milliseconds to Duration
The simplified episode object contains the simplified episode information, and its duration_ms field is an integer that represents the episode length in milliseconds.
The response JSON data from HTTP API:
{
    ...
    "audio_preview_url" : "https://p.scdn.co/mp3-preview/83bc7f2d40e850582a4ca118b33c256358de06ff",
    "description" : "Följ med Tobias Svanelid till Sveriges äldsta tegelkyrka",
    "duration_ms" : 2685023,
    "explicit" : false,
    ...
}
The original model was:
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct SimplifiedEpisode {
    pub audio_preview_url: Option<String>,
    pub description: String,
    pub duration_ms: u32,
    ...
}
As before, Koxiaet pointed out that
SimplifiedEpisode::duration_ms should be replaced with a duration of type Duration, since a built-in Duration type works better than a primitive type.
Since I had already worked with Serde's custom deserialization, this was no longer a hard job. I easily figured out how to deserialize a u64 to a Duration:
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct SimplifiedEpisode {
    pub audio_preview_url: Option<String>,
    pub description: String,
    #[serde(
        deserialize_with = "from_duration_ms",
        serialize_with = "to_duration_ms",
        rename = "duration_ms"
    )]
    pub duration: Duration,
    ...
}
/// Visitor to help deserialize a duration represented as milliseconds to `std::time::Duration`
struct DurationVisitor;
impl<'de> de::Visitor<'de> for DurationVisitor {
    type Value = Duration;
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        write!(formatter, "a number of milliseconds representing a std::time::Duration")
    }
    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(Duration::from_millis(v))
    }
}
/// Deserialize `std::time::Duration` from milliseconds (represented as u64)
pub(in crate) fn from_duration_ms<'de, D>(d: D) -> Result<Duration, D::Error>
where
    D: de::Deserializer<'de>,
{
    d.deserialize_u64(DurationVisitor)
}
Now life is easier than before.
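To illustrate why a Duration is more convenient to consume than a raw integer, here is a small standard-library sketch. Both helper names are hypothetical: duration_to_millis shows the kind of conversion a serializer such as to_duration_ms would need (its actual body is not shown in this post), and format_mm_ss is just an example consumer.

```rust
use std::time::Duration;

/// The integer a serializer would write for `duration_ms`.
/// Hypothetical helper; saturating at u64::MAX is a choice made for this sketch.
fn duration_to_millis(d: Duration) -> u64 {
    // `as_millis` returns a u128; saturate instead of silently truncating.
    d.as_millis().min(u128::from(u64::MAX)) as u64
}

/// Render a duration as `minutes:seconds`, e.g. for a player UI.
fn format_mm_ss(d: Duration) -> String {
    let total_seconds = d.as_secs();
    format!("{}:{:02}", total_seconds / 60, total_seconds % 60)
}
```

For the sample episode, Duration::from_millis(2685023) formats as 44:45 and converts back to 2685023 milliseconds, without the caller having to remember which unit the integer was in.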
Deserialize milliseconds to Option<Duration>
Let's go back to the CurrentlyPlayingContext model. Since we have replaced milliseconds (represented as u32) with Duration, it makes sense to replace all millisecond fields with Duration. But hold on, the progress_ms field is a little different.
The progress_ms field is either absent or a number of milliseconds. A u32 holds the milliseconds, but since the value might not be present in the response, the field is an Option<u32>, so it won't work with from_duration_ms.
Thus, it's necessary to figure out how to handle the Option type, and the answer is in the documentation of the deserialize_option function:
Hint that the Deserialize type is expecting an optional value.
This allows deserializers that encode an optional value as a nullable value to convert the null value into None and a regular value into Some(value).
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct CurrentlyPlayingContext {
    pub context: Option<Context>,
    #[serde(
        deserialize_with = "from_millisecond_timestamp",
        serialize_with = "to_millisecond_timestamp"
    )]
    pub timestamp: DateTime<Utc>,
    #[serde(default)]
    #[serde(
        deserialize_with = "from_option_duration_ms",
        serialize_with = "to_option_duration_ms",
        rename = "progress_ms"
    )]
    pub progress: Option<Duration>,
}
/// Deserialize `Option<std::time::Duration>` from milliseconds (represented as u64)
pub(in crate) fn from_option_duration_ms<'de, D>(d: D) -> Result<Option<Duration>, D::Error>
where
    D: de::Deserializer<'de>,
{
    d.deserialize_option(OptionDurationVisitor)
}
As before, OptionDurationVisitor is an empty struct that implements the Visitor trait. The key point is that, in order to work with deserialize_option, OptionDurationVisitor has to implement the visit_none and visit_some methods:
/// Visitor to help deserialize optional milliseconds to `Option<std::time::Duration>`
struct OptionDurationVisitor;
impl<'de> de::Visitor<'de> for OptionDurationVisitor {
    type Value = Option<Duration>;
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        write!(
            formatter,
            "an optional number of milliseconds representing a std::time::Duration"
        )
    }
    fn visit_none<E>(self) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(None)
    }
    fn visit_some<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: de::Deserializer<'de>,
    {
        Ok(Some(deserializer.deserialize_u64(DurationVisitor)?))
    }
}
The visit_none method returns Ok(None), so the progress value in the struct will be None. The visit_some method delegates the parsing logic to DurationVisitor via the deserialize_u64 call, so deserializing Some(u64) works just like deserializing a u64.
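Stripped of the visitor machinery, the mapping that visit_none and visit_some implement is the one below. This is a standard-library sketch with a hypothetical function name; it is not rspotify's code.

```rust
use std::time::Duration;

/// Map an optional millisecond count to an optional `Duration`.
/// `None` stays `None` (the `visit_none` case); `Some(ms)` is converted
/// exactly like a plain `u64` would be (the `visit_some` case).
/// `progress_from_millis` is a hypothetical helper name.
fn progress_from_millis(ms: Option<u64>) -> Option<Duration> {
    ms.map(Duration::from_millis)
}
```

Because Serde implements Deserialize for Option<T>, a deserialize_with function could probably reach the same result without a custom visitor, by deserializing an Option<u64> first and then mapping it as above. That variant is an assumption on my side and was not the approach taken here.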
Summary
To be honest, this was the first time I needed custom (de)serialization, and it took me some time to understand how Serde works. In the end the investment paid off, and it works great now.
Serde is such an awesome serialization/deserialization framework; I have learnt a lot from it and still have a lot to learn.