 3 years ago
source link: https://0x709394.me/Serde-tricks

Preface

Recently, Mario and I have been refactoring rspotify to improve its performance, documentation, error handling, and data model, and to reduce its compile time, all to make it easier to use. (For those who have never heard of rspotify, it is a Spotify HTTP SDK implemented in Rust.)

I am partly focusing on polishing the data model, based on the issue created by Koxiaet. Since rspotify is an API client for Spotify, it has to handle requests to and responses from the Spotify HTTP API. Generally speaking, the data model describes how the response data is structured, and rspotify uses Serde to parse the JSON responses from the HTTP API into Rust structs. I have learnt a lot of Serde tricks from this refactoring.

Serde Lesson

Deserialize JSON map to Vec<T> based on its value.

The actions object contains a disallows object, which makes it possible to update the user interface based on which playback actions are available within the current context.

The response JSON data from HTTP API:

{
	...
	"disallows": {
		"resuming": true
	}
	...
}

The original model representing actions was:

#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct Actions {
    pub disallows: HashMap<DisallowKey, bool>
}

#[derive(Clone, Serialize, Deserialize, Copy, PartialEq, Eq, Debug, Hash, ToString)]
#[serde(rename_all = "snake_case")]
#[strum(serialize_all = "snake_case")]
pub enum DisallowKey {
    InterruptingPlayback,
    Pausing,
    Resuming,
	...
}

Koxiaet gave a great piece of advice on how to polish Actions:

Actions::disallows can be replaced with a Vec<DisallowKey> or HashSet<DisallowKey> by removing all entries whose value is false, which will result in a simpler API.

To be honest, I was not that familiar with Serde before. After digging through its official documentation for a while, it seemed there was no built-in way to convert a JSON map to a Vec<T> based on the map's values.

After reading the Custom serialization section of the documentation, a simple solution came to mind, so I wrote my first custom deserialize function.

I created a helper struct, OriginalActions, inside the deserialize function, and converted its HashMap to a Vec by keeping only the keys whose value is true.

use std::collections::HashMap;

use serde::{Deserialize, Deserializer, Serialize};

#[derive(Clone, Debug, Serialize, PartialEq, Eq)]
pub struct Actions {
    pub disallows: Vec<DisallowKey>,
}

impl<'de> Deserialize<'de> for Actions {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Helper struct that mirrors the JSON shape returned by the API.
        #[derive(Deserialize)]
        struct OriginalActions {
            pub disallows: HashMap<DisallowKey, bool>,
        }

        let original_actions = OriginalActions::deserialize(deserializer)?;
        Ok(Actions {
            // Keep only the keys whose value is `true`.
            disallows: original_actions
                .disallows
                .into_iter()
                .filter(|(_, value)| *value)
                .map(|(key, _)| key)
                .collect(),
        })
    }
}

The types should be familiar if you've used Serde before.

If you're not used to Rust, the function signature will likely look a little bit strange. What it says is that deserializer will be something that implements Serde's Deserializer trait, and that any data borrowed from the input will live for the 'de lifetime.
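The filtering step can be tried on its own, without Serde. The following is a minimal sketch rather than rspotify's code: the helper name enabled_keys is hypothetical, and plain string keys stand in for DisallowKey. Because HashMap iteration order is unspecified, the sketch sorts the result to make it deterministic.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of rspotify): keep only the keys whose value is `true`.
fn enabled_keys<K: Ord>(map: HashMap<K, bool>) -> Vec<K> {
    let mut keys: Vec<K> = map
        .into_iter()
        .filter(|(_, value)| *value)
        .map(|(key, _)| key)
        .collect();
    // HashMap iteration order is unspecified, so sort for a stable result.
    keys.sort();
    keys
}
```

If the order of the keys does not matter, collecting into a HashSet instead of a Vec, as Koxiaet's suggestion also allows, would avoid the sort.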

Deserialize Unix milliseconds timestamp to DateTime<Utc>

The currently playing object contains information about the currently playing item. Its timestamp field is an integer representing the Unix millisecond timestamp at which the data was fetched.

The response JSON data from HTTP API:

{
  ...
  "timestamp": 1490252122574,
  "progress_ms": 44272,
  "is_playing": true,
  "currently_playing_type": "track",
  "actions": {
    "disallows": {
      "resuming": true
    }
  }
  ...
}

The original model was:

/// Currently playing object
///
/// [Reference](https://developer.spotify.com/documentation/web-api/reference/player/get-the-users-currently-playing-track/)
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct CurrentlyPlayingContext {
    pub timestamp: u64,
    pub progress_ms: Option<u32>,
    pub is_playing: bool,
    pub item: Option<PlayingItem>,
    pub currently_playing_type: CurrentlyPlayingType,
    pub actions: Actions,
}

As before, Koxiaet made a great point about timestamp and progress_ms (I will talk about the latter later):

CurrentlyPlayingContext::timestamp should be a chrono::DateTime<Utc>, which could be easier to use.

The polished struct looks like:

#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct CurrentlyPlayingContext {
    pub context: Option<Context>,
    #[serde(
        deserialize_with = "from_millisecond_timestamp",
        serialize_with = "to_millisecond_timestamp"
    )]
    pub timestamp: DateTime<Utc>,
    pub progress_ms: Option<u32>,
    pub is_playing: bool,
    pub item: Option<PlayingItem>,
    pub currently_playing_type: CurrentlyPlayingType,
    pub actions: Actions,
}

Using the deserialize_with attribute tells Serde to use custom deserialization code for the timestamp field. The from_millisecond_timestamp code is:

/// Deserialize Unix millisecond timestamp to `DateTime<Utc>`
pub(in crate) fn from_millisecond_timestamp<'de, D>(d: D) -> Result<DateTime<Utc>, D::Error>
where
    D: de::Deserializer<'de>,
{
    d.deserialize_u64(DateTimeVisitor)
}

The code calls d.deserialize_u64, passing in a struct. That struct implements Serde's Visitor trait and looks like this:

// Visitor to help deserialize unix millisecond timestamp to `chrono::DateTime`
struct DateTimeVisitor;

impl<'de> de::Visitor<'de> for DateTimeVisitor {
    type Value = DateTime<Utc>;
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        write!(
            formatter,
            "a unix millisecond timestamp representing DateTime<Utc>"
        )
    }
    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
	   ...
    }
}

The DateTimeVisitor struct doesn't have any fields; it is just a type that implements the custom visitor, to which the parsing of the u64 is delegated.

I did not find a way to construct a DateTime directly from a Unix millisecond timestamp, so I had to figure out how to handle the construction myself. It turns out that a DateTime can be constructed from seconds and nanoseconds:

use chrono::{DateTime, TimeZone, NaiveDateTime, Utc};

let dt = DateTime::<Utc>::from_utc(NaiveDateTime::from_timestamp(61, 0), Utc);

Thus, all I need to do is convert the milliseconds into seconds and nanoseconds:

fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
where
    E: de::Error,
{
    // Whole seconds, plus the leftover milliseconds expressed as nanoseconds.
    let second = v / 1000;
    let nanosecond = ((v % 1000) * 1_000_000) as u32;
    // u64::MAX / 1000 is smaller than i64::MAX, so the cast to i64 cannot overflow.
    let dt = DateTime::<Utc>::from_utc(
        NaiveDateTime::from_timestamp(second as i64, nanosecond),
        Utc,
    );
    Ok(dt)
}

The to_millisecond_timestamp function is similar to from_millisecond_timestamp but easier to implement; check this PR for more details.
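The arithmetic in both directions can be checked without chrono. The sketch below is only an illustration: split_millis and join_millis are hypothetical names, and the real to_millisecond_timestamp in the PR may well be written differently.

```rust
/// Hypothetical helper: split a Unix millisecond timestamp into
/// whole seconds and the leftover part expressed in nanoseconds.
fn split_millis(ms: u64) -> (i64, u32) {
    let seconds = (ms / 1000) as i64;
    let nanoseconds = ((ms % 1000) * 1_000_000) as u32;
    (seconds, nanoseconds)
}

/// Hypothetical helper for the reverse direction: rebuild the
/// millisecond timestamp from seconds and nanoseconds.
fn join_millis(seconds: i64, nanoseconds: u32) -> i64 {
    seconds * 1000 + i64::from(nanoseconds / 1_000_000)
}
```

For the timestamp in the sample response, 1490252122574, the split yields 1490252122 seconds and 574000000 nanoseconds, and joining those two values gives the original number back.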

Deserialize milliseconds to Duration

The simplified episode object contains the simplified episode information, and the duration_ms field is an integer, which represents the episode length in milliseconds.

The response JSON data from HTTP API:

{
    ...
    "audio_preview_url" : "https://p.scdn.co/mp3-preview/83bc7f2d40e850582a4ca118b33c256358de06ff",
    "description" : "Följ med Tobias Svanelid till Sveriges äldsta tegelkyrka",
    "duration_ms" : 2685023,
    "explicit" : false,
	...
}

The original model was:

#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct SimplifiedEpisode {
    pub audio_preview_url: Option<String>,
    pub description: String,
    pub duration_ms: u32,
	...
}

Once again, Koxiaet pointed out that:

SimplifiedEpisode::duration_ms should be replaced with a duration of type Duration, since a built-in Duration type works better than primitive type.

Since I had already worked with Serde's custom deserialization, this was no longer a hard job. I easily figured out how to deserialize a u64 into a Duration:

#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct SimplifiedEpisode {
    pub audio_preview_url: Option<String>,
    pub description: String,
    #[serde(
        deserialize_with = "from_duration_ms",
        serialize_with = "to_duration_ms",
        rename = "duration_ms"
    )]
    pub duration: Duration,
	...
}

/// Visitor to help deserialize a duration represented as milliseconds to `std::time::Duration`
struct DurationVisitor;
impl<'de> de::Visitor<'de> for DurationVisitor {
    type Value = Duration;
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        write!(formatter, "a number of milliseconds representing std::time::Duration")
    }
    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(Duration::from_millis(v))
    }
}

/// Deserialize `std::time::Duration` from milliseconds (represented as u64)
pub(in crate) fn from_duration_ms<'de, D>(d: D) -> Result<Duration, D::Error>
where
    D: de::Deserializer<'de>,
{
    d.deserialize_u64(DurationVisitor)
}

Now life is easier than before.
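The article does not show to_duration_ms, so here is a minimal std-only sketch of the conversion in both directions. The function names are hypothetical, and how rspotify actually handles a Duration too large for a u64 is an assumption; this sketch simply returns None in that case.

```rust
use std::time::Duration;

/// Hypothetical helper: the conversion performed by `DurationVisitor::visit_u64`.
fn duration_from_ms(ms: u64) -> Duration {
    Duration::from_millis(ms)
}

/// Hypothetical helper: the value a serializer would have to write back.
/// Returns `None` if the duration does not fit into a `u64` of milliseconds.
fn duration_to_ms(duration: Duration) -> Option<u64> {
    duration
        .as_secs()
        .checked_mul(1000)?
        .checked_add(u64::from(duration.subsec_millis()))
}
```

Note that any precision below one millisecond is dropped on the way back, which is harmless here because the API only ever provides whole milliseconds.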

Deserialize milliseconds to Option<Duration>

Let's go back to the CurrentlyPlayingContext model. Since we have replaced milliseconds (represented as u32) with Duration, it makes sense to replace all millisecond fields with Duration. But hold on, the progress_ms field is a little bit different.

The progress_ms field is either absent or a number of milliseconds. A u32 handles the milliseconds, but because the value might not be present in the response, the field is an Option<u32>, so it won't work with from_duration_ms.

Thus, it's necessary to figure out how to handle the Option type, and the answer is in the documentation of the deserialize_option function:

Hint that the Deserialize type is expecting an optional value.

This allows deserializers that encode an optional value as a nullable value to convert the null value into None and a regular value into Some(value).

#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct CurrentlyPlayingContext {
    pub context: Option<Context>,
    #[serde(
        deserialize_with = "from_millisecond_timestamp",
        serialize_with = "to_millisecond_timestamp"
    )]
    pub timestamp: DateTime<Utc>,
    #[serde(default)]
    #[serde(
        deserialize_with = "from_option_duration_ms",
        serialize_with = "to_option_duration_ms",
        rename = "progress_ms"
    )]
    pub progress: Option<Duration>,
}

/// Deserialize `Option<std::time::Duration>` from milliseconds (represented as u64)
pub(in crate) fn from_option_duration_ms<'de, D>(d: D) -> Result<Option<Duration>, D::Error>
where
    D: de::Deserializer<'de>,
{
    d.deserialize_option(OptionDurationVisitor)
}

As before, OptionDurationVisitor is an empty struct that implements the Visitor trait. The key point is that, in order to work with deserialize_option, OptionDurationVisitor has to implement the visit_none and visit_some methods:

struct OptionDurationVisitor;

impl<'de> de::Visitor<'de> for OptionDurationVisitor {
    type Value = Option<Duration>;
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        write!(
            formatter,
            "an optional number of milliseconds representing std::time::Duration"
        )
    }
    fn visit_none<E>(self) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(None)
    }

    fn visit_some<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: de::Deserializer<'de>,
    {
        Ok(Some(deserializer.deserialize_u64(DurationVisitor)?))
    }
}

The visit_none method returns Ok(None), so the progress value in the struct will be None. The visit_some method delegates the parsing logic to DurationVisitor via the deserialize_u64 call, so deserializing Some(u64) works just like deserializing a u64.
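Stripped of the Serde machinery, the two visitor methods amount to a mapping over Option. The sketch below uses a hypothetical helper, progress_from_ms, purely to show that mapping; it is not part of rspotify.

```rust
use std::time::Duration;

/// Hypothetical helper that mirrors the two visitor methods:
/// the `None` arm plays the role of `visit_none`,
/// the `Some` arm plays the role of `visit_some`.
fn progress_from_ms(ms: Option<u64>) -> Option<Duration> {
    match ms {
        None => None,
        Some(value) => Some(Duration::from_millis(value)),
    }
}
```

The match is spelled out to line up with the visitor; in everyday code, ms.map(Duration::from_millis) expresses the same thing more briefly.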

Summary

To be honest, this was the first time I needed some custom work, and it took me some time to understand how Serde works. In the end, the investment paid off, and it works great now.

Serde is such an awesome deserialization/serialization framework; I have learnt a lot from it and still have a lot to learn.
