A short guide for Elixir developers
Jason is an actively maintained JSON parser and generator for Elixir. If you're dealing with JSON structures to send to your clients, or simply shepherding data around your Elixir application, you're likely to reach for this or Poison, and the contents of this post apply to Poison too.
At Multiverse, we're working on developing a new version of an existing third-party integration and taking it in-house as part of our core platform. The feature in question isn't important. It's just important to know that there's a lot of data: think millions of records.
One problem is that we don't have direct access to the database of the existing system. What we do have is the ability to export the data into a CSV!
Luckily there are only eight or so fields for each record, so we can resort to a good, old-fashioned CSV import to bring the data over.
Simple, right? Wrong.
Importing a million or more rows of data in PSQL isn't too bad, but it will slow down and potentially lock up the rest of our application while we do the import, and that's bad! We also need to transform the data into a more reasonable data structure for our new implementation: the old version had some inconsistencies and type issues we'd like to do away with. For example, dates are represented in epoch time, and some data fields aren't required.
So we have to transform and import each record, which will take time. To mitigate this, we will use RabbitMQ, an open source message broker. We package each row from our CSV into a JSON message and fire the messages off asynchronously, letting our RabbitMQ consumer deal with them concurrently as they arrive.
If you're interested in learning more about Rabbit, leave a comment.
So, where does Jason encoding come in?
Remember that we want to transform our data. The issue with exporting everything into a CSV file is that when we read it into our application, we'll be reading everything as a string. Converting our date fields now involves calling String.to_integer/1 on each of them by hand, which is something we don't want to have to do. Imagine we had 100 fields to import!
A good programmer is a lazy programmer.
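To make that concrete, here's a rough sketch of the manual conversion for a single row. The sample values and the five-column layout are made up for illustration; only String.split/2, String.to_integer/1 and DateTime.from_unix/1 from the standard library are used.

```elixir
# A raw CSV line as it might come out of the export (hypothetical values).
line = "42,7,1001,1609459200,First session"

# Splitting gives us strings only, even for the numeric columns.
[id, _user_id, _legacy_log_id, time, notes] = String.split(line, ",")

# Each numeric field has to be converted by hand.
id = String.to_integer(id)
time = String.to_integer(time)

# Epoch seconds can then be turned into a proper DateTime.
{:ok, datetime} = DateTime.from_unix(time)

IO.inspect({id, datetime, notes})
```

Multiply that by every numeric column and every record, and doing the conversion once, in one place, starts to look very appealing.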
defmodule LegacyLog do
  defstruct [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]
end
All we have here is a simple Elixir struct with a field for each piece of data we want to handle and parse from the CSV import.
Using it in our CSV import:
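A minimal sketch of what that step could look like, assuming a headerless CSV whose columns follow the struct's field order and contain no quoted commas. The CsvImportSketch module, the column order and the sample row are hypothetical, not taken from the real import.

```elixir
defmodule LegacyLog do
  defstruct [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]
end

defmodule CsvImportSketch do
  # Assumed column order; it simply mirrors the struct's field order.
  @columns [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]

  # Builds a LegacyLog from one CSV line. Every value stays a string here;
  # no conversion happens at this stage.
  def row_to_log(line) do
    values = line |> String.trim() |> String.split(",")
    struct!(LegacyLog, Enum.zip(@columns, values))
  end
end

log =
  CsvImportSketch.row_to_log(
    "42,7,1001,3600,First session,1609459200,1609459200,1609459200\n"
  )

IO.inspect(log)
```

A real import would more likely stream the file and use a proper CSV parser, but the shape is the same: one row in, one struct out, all values still strings.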
Yes, we could transform the data here as we import it, but good RabbitMQ practice involves pretending that our messages are coming from somewhere external that we don't have direct access to, even though we're producing and consuming them ourselves in this instance.
The bit you’re actually right here for.
We need to take advantage of the way Jason has defined Protocols to provide a custom implementation of how to handle our encoding.
Following on from our previously defined struct, we're going to add a defimpl for the struct itself and specify what we want to do:
Here I'm defining an implementation for the type LegacyLog, meaning that when a Jason.encode/2 call comes its way, it knows to call my custom code. Any other encoding function call will use the default implementation (which is fine for most use cases).
The custom code itself is:
- Popping the one field I want to remain a string out of the struct and keeping the rest
- Converting the remaining struct into a map
- Iterating through each field and changing them into integer values
- Putting the notes back
- Doing the Jason.Encode call, which ensures valid JSON.
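The steps above can be sketched roughly as follows. This is an illustration under a few assumptions rather than the exact implementation: it runs as a standalone script via Mix.install (with protocol consolidation switched off so the defimpl takes effect), it assumes every field other than notes holds a string of digits, and it uses Jason.Encode.map/2 for the final step, which is a guess at which Jason.Encode function fits here.

```elixir
# Assumption: standalone script. Consolidation is disabled so that the
# defimpl below is picked up outside a compiled Mix project.
Mix.install([{:jason, "~> 1.4"}], consolidate_protocols: false)

defmodule LegacyLog do
  defstruct [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]
end

defimpl Jason.Encoder, for: LegacyLog do
  def encode(log, opts) do
    # 1. Pop the field that should stay a string; keep the rest.
    {notes, rest} = Map.pop(log, :notes)

    rest
    # 2. Convert what remains of the struct into a plain map.
    |> Map.from_struct()
    # 3. Turn every remaining value into an integer.
    #    Assumes each one is a string of digits (no nil or blank values).
    |> Map.new(fn {key, value} -> {key, String.to_integer(value)} end)
    # 4. Put the notes back.
    |> Map.put(:notes, notes)
    # 5. Hand over to Jason's map encoder to produce valid JSON.
    |> Jason.Encode.map(opts)
  end
end

# Hypothetical sample record, with every value as a string, as read from CSV.
log =
  struct!(LegacyLog,
    id: "42",
    user_id: "7",
    legacy_log_id: "1001",
    time: "3600",
    notes: "First session",
    date: "1609459200",
    inserted_at: "1609459200",
    updated_at: "1609459200"
  )

decoded = log |> Jason.encode!() |> Jason.decode!()
IO.inspect(decoded)
```

Inside a regular Mix project the Mix.install line isn't needed; Jason would simply be listed as a dependency in mix.exs. Empty or missing CSV values would make String.to_integer/1 raise, so a real implementation would want to decide how to handle those.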